Publisher · verified by editors

MarkTechPost

AI news source. Articles are auto-selected and adapted by Hamidun News editors.

290 articles in Hamidun · Latest: July 17 · Active · marktechpost.com ↗

Latest publications

Meta Unveils Astryx — A React Design System with AI Agent and MCP Support

Meta has released Astryx, an open-source React and StyleX design system with an MCP server and CLI, where engineers and AI agents work with the same components via a unified API. *Meta is recognized as an extremist organ

Jun 28, 2026·2 min

LLM · MarkTechPost

Liquid AI released LFM2.5-230M: 213 tokens/s on Galaxy S25 and support for llama.cpp

Liquid AI has released LFM2.5-230M, an open 230M-parameter model that runs offline on a smartphone or a Raspberry Pi and outperforms rivals four times its size on instruction following.

Jun 28, 2026·2 min

LLM · MarkTechPost

Datalab Releases lift — a 9B Open-Weight Model for Extracting JSON from PDFs

Datalab has open-sourced lift, a 9-billion-parameter vision model that converts PDFs and images into structured JSON according to a given schema, achieving 90.2% field accuracy on a 225-document benchmark.

Jun 28, 2026·3 min

LLM · MarkTechPost

Z.ai releases GLM-5.2: a working million-token context and two levels of deep thinking

Z.ai launched GLM-5.2 with a fully usable 1-million-token context window, High and Max thinking modes, and support for Claude Code, Cline, and OpenClaw via an Anthropic-compatible API.

Jun 15, 2026·2 min

LLM · MarkTechPost

FineWeb without downloading terabytes: streaming, filtering, and tokenizing a web corpus for LLM training

A practical guide to FineWeb from Hugging Face: how to work with a multi-terabyte web corpus for LLM training through streaming, filtering, deduplication, and tokenization — without fully downloading the data.

Jun 15, 2026·3 min

LLM · MarkTechPost

Zyphra Released Zamba2-VL: Visual Models with 10x Faster Response

Zyphra released Zamba2-VL — open VLMs with 1.2B, 2.7B and 7B parameters with a hybrid Mamba2 + Transformer architecture that reduces time to first token by approximately 10x.

Jun 15, 2026·2 min

LLM · MarkTechPost

Moonshot AI Launches Kimi Work — Desktop Agent with Swarm of 300 Sub-Agents

Kimi Work from Moonshot AI runs locally on macOS and Windows, controls the browser via WebBridge, and coordinates a swarm of up to 300 parallel sub-agents.

Jun 15, 2026·2 min

LLM · MarkTechPost

Claude Code 2026: Complete Breakdown of 25 Agentic Tool Features with Examples

MarkTechPost released a guide to Claude Code 2026 that breaks down 25 features, from CLAUDE.md and skills to MCP, hooks, and Auto Mode, complete with code examples and a demo.

Jun 15, 2026·2 min

LLM · MarkTechPost

Google releases Gemini-SQL2: Gemini 3.1 Pro scores 80% on the BIRD benchmark

Google Research introduced Gemini-SQL2 based on Gemini 3.1 Pro — the model achieved 80.04% accuracy on the BIRD benchmark for text-to-SQL conversion.

Jun 15, 2026·3 min

LLM · MarkTechPost

How to Build an Agent Workspace on QwenPaw with Custom Skills and Streaming API

Step-by-step tutorial for deploying QwenPaw — a framework for creating AI agents with support for multiple model providers, custom skills, and streaming API in Google Colab.

Jun 15, 2026·3 min

LLM · MarkTechPost

Moonshot AI Releases Kimi K2.7-Code: 21.8% Improvement on Code Bench v2 over K2.6

Moonshot AI open-sourced Kimi K2.7-Code — an agentic coding model with 256K context and 30% lower reasoning token consumption compared to K2.6.

Jun 15, 2026·3 min

LLM · MarkTechPost

2026 TTS Models Comparison: From Commercial to Open Source

In 2026, choosing a TTS model depends on three factors: sound quality, processing latency, and cost. Commercial solutions win on naturalness, open models on control and affordability.

May 31, 2026·3 min

LLM · MarkTechPost

StepFun unveils Step 3.7 Flash — a 198-billion-parameter Vision-Language model

StepFun has released Step 3.7 Flash, a new multimodal model with 198 billion parameters, built-in vision, a 256K-token context window, and Advisor mode for coding agents.

May 31, 2026·3 min

LLM · MarkTechPost

NVIDIA X-Token: distillation that beats GOLD by 3.82 points

NVIDIA has released X-Token, a knowledge distillation method for small models (Llama-3.2-1B) that outperforms GOLD by 3.82 points and improves math accuracy from 2.56% to 15.54%.

May 31, 2026·2 min

LLM · MarkTechPost

AgentTrove: how to use the 1.7M agent trace dataset in Python

AgentTrove is the largest open dataset of agent interaction traces: 1.7 million examples in ShareGPT format. A Python tutorial shows how to stream the data, normalize agent actions, and prepare the dataset for model fine

May 31, 2026·2 min

LLM · MarkTechPost

Nous Research released Tool Search for Hermes Agent: accuracy improved by 49–74% with Opus 4

Nous Research addressed context bloat in MCP by adding smart tool search. The system selects only the relevant tool schemas and improves accuracy by 49–74% when working with Opus 4.

May 31, 2026·2 min

LLM · MarkTechPost

Genesis AI Released Genesis World 1.0, a Platform That Evaluates Robots 400 Times Faster

Genesis AI released Genesis World 1.0, a robot simulation platform that cuts evaluation time from 200 hours to 30 minutes (a 400x speedup) and reportedly matches real robot behavior at 90%.

May 31, 2026·2 min

LLM · MarkTechPost

NVIDIA Releases Polar — Framework for Training Code Agents

NVIDIA introduced Polar, a framework for training language agents using reinforcement learning, which improved SWE-Bench performance by 22.6 points in the Codex environment.

May 29, 2026·2 min

LLM · MarkTechPost

UC Berkeley created mKernel: a unified library for GPU synchronization in clusters

UC Berkeley released mKernel — a new CUDA library for synchronizing thousands of GPUs in data centers, combining local and remote communication in a single persistent kernel.

May 29, 2026·2 min

LLM · MarkTechPost

Stability AI Releases Stable Audio 3 for Fast Music Generation

Stability AI introduced Stable Audio 3 — models for music and sound effects generation that run on MacBook and consumer GPUs with 8GB VRAM.

May 29, 2026·2 min

LLM · MarkTechPost

ZeroEntropy Unveiled Zerank-2 — A Lightweight Reranker for Precise Search

ZeroEntropy released Zerank-2, a compact cross-encoder based on Qwen3 that significantly improves search quality in two-stage RAG systems.

May 29, 2026·2 min

LLM · MarkTechPost

Sakana AI Introduces DiffusionBlocks: A Method for Block-by-Block Neural Network Training

Sakana AI has introduced DiffusionBlocks, a new method that trains the layers of residual neural networks independently by interpreting their updates as reverse diffusion.

May 29, 2026·3 min

LLM · MarkTechPost

Vector Search in PostgreSQL: Complete Guide to pgvector for AI Applications

PostgreSQL has become a serious competitor to specialized vector databases thanks to the pgvector extension.

May 29, 2026·3 min

LLM · MarkTechPost

Perplexity AI Released a Tokenizer 5x Faster Than the Hugging Face Standard

Perplexity AI released a rewritten Unigram tokenizer that accelerates text processing 5x and reduces CPU load 5-6x in production environments.

May 29, 2026·3 min

LLM · MarkTechPost

Scientists Created MEMO — A Framework for Expanding LLM Memory Without Retraining

Scientists proposed MEMO — a framework that allows LLMs to learn from new data without retraining the base model, using a separate memory module.

May 29, 2026·3 min

LLM · MarkTechPost

EAGLE 3.1: How to Fix Speculative Decoding Instability in LLMs

A joint release by the EAGLE team, vLLM, and TorchSpec fixes a critical speculative decoding issue — attention drift — that was slowing down large language model inference in production.

May 29, 2026·2 min

LLM · MarkTechPost

Anthropic Released Claude Opus 4.8 with Dynamic Workflows and Cheaper Fast Mode

Anthropic introduced Claude Opus 4.8 with dynamic workflows and a more affordable fast mode. The update is available in Claude Code's research preview.

May 29, 2026·3 min

LLM · MarkTechPost

Liquid AI Releases LFM2.5-8B: A Compact MoE Model with 128K Context

Liquid AI introduced the new LFM2.5-8B-A1B model — an efficient MoE model that activates only 1.5B parameters out of 8.3B. Runs on consumer PCs with a 128K context window.

May 29, 2026·2 min

LLM · MarkTechPost

Hexo Labs Published SIA — an Agent That Updates Itself During Operation

Hexo Labs released the open source code for SIA — a system that improves itself by updating both the agent's instructions and the weights of its neural network.

May 29, 2026·1 min

LLM · MarkTechPost

Microsoft Research releases Webwright, a browser agent that solves 60% of web tasks

Microsoft Research introduced Webwright, a browser agent that handles complex web tasks better than a baseline large language model: 60% success on the Odysseys benchmark versus 33.5% for baseline GPT-5.4.

May 25, 2026·2 min