MarkTechPost

Google Presented Auto-Diagnose — an AI System for Finding Causes of Integration Test Failures
Google presented Auto-Diagnose — a system based on Gemini 2.5 Flash that automatically analyzes logs from failed integration tests and ident

OpenAI GPT-OSS: Launching Open-Weight Models in Colab with MXFP4 and Advanced Inference
A new guide explains how to launch openai/gpt-oss-20b in Google Colab: install dependencies for Transformers, check GPU, enable MXFP4 quanti

Anthropic releases Claude Opus 4.7 for agentic programming, vision, and autonomous tasks
Anthropic introduced Claude Opus 4.7 — an update to its flagship model with notable improvements in agentic coding, detailed image analysis,

PrismML Bonsai: How to Run a 1-Bit Model on CUDA with GGUF, JSON and RAG
In a new PrismML Bonsai tutorial, we covered how to run Bonsai-1.7B on GPU via CUDA and GGUF, measure throughput, configure chat, strict JSO

xAI launches separate Grok APIs for speech recognition and synthesis for corporate developers
xAI has split Grok's voice stack into standalone APIs: Speech-to-Text and Text-to-Speech for business with aggressive pricing, multilingual

NVIDIA released Ising — the first open family of AI models for quantum-classical systems
NVIDIA presented Ising — an open family of AI models for quantum processor calibration and error correction to bring useful hybrid quantum-c

Why LoRA Breaks in Production and How RS-LoRA Saves Model Fine-tuning
LoRA excels at style and format, but loses signal when fine-tuning with new facts; RS-LoRA solves the problem by changing the scaling formul

OpenKB and OpenRouter show how to build a local AI knowledge base with Llama search
A new tutorial explains how to deploy a local knowledge base on OpenKB, connect an open model via OpenRouter, and safely configure search wi

OpenAI and Magika showed how to build a pipeline for file recognition and threat analysis
The guide demonstrates a practical pipeline where Magika determines the actual file type from bytes, and OpenAI helps interpret the result a

Meta introduced Sapiens2 — a unified computer vision model for pose, segmentation, and 3D
Meta Reality Labs released Sapiens2 — a family of vision models for human analysis that solves pose, segmentation, surface geometry, and 3D

OpenMOSS releases MOSS-Audio — an open audio model that outperforms larger alternatives
OpenMOSS introduced MOSS-Audio — an open model that understands speech, music, and ambient sounds in a single architecture and outperforms s

OpenAI Embeddings and RL: How to Build an Agent with Long-Term Memory for Accurate Answers
The tutorial shows how to train an RL agent to select relevant records from long-term memory so that an LLM answers questions about saved fa

How to Measure Real Intelligence: Key Benchmarks for AI Agents
Classical tests no longer reflect the real capabilities of neural networks. We explore which benchmarks truly show whether an AI agent is re

Elastic Memory for AI: How kvcached Solves the GPU Shortage
Dynamic KV-cache distribution promises to radically reduce the cost of hosting language models by enabling efficient memory sharing across a

xAI's Voice Model Surpasses GPT Realtime in Business Tasks
Elon Musk's xAI has unveiled grok-voice-think-fast-1.0. The new model outperformed solutions from OpenAI and Google in managing complex busi

GitNexus: How a New Tool Taught AI Agents Structural Code Understanding
AI assistants excel at writing local code but often fail to grasp the global architecture of projects. The open-source GitNexus project solv

DeepSeek-V4: How New Compression Algorithms Made One-Million-Token Context a Reality
DeepSeek introduced the fourth generation of its models. Through radical attention compression, processing massive data volumes is becoming

Decoupled DiLoCo Architecture from DeepMind Solves AI Scaling Problem
Training advanced AI models has always been held hostage by hardware failures: one chip breaking would stop the entire cluster. A new archit

OpenMythos: Open-source PyTorch reconstruction of Claude Mythos architecture with 770M parameters
Developer Kye Gomez reconstructed the presumed Claude Mythos architecture from Anthropic from scratch — without leaks, based only on public

OpenAI Scales GPT-5.4-Cyber for Verified Cybersecurity Professionals
OpenAI opens broad access to GPT-5.4-Cyber — a fine-tuned version of GPT-5.4 without standard restrictions for thousands of verified defende

Moonshot AI releases Kimi K2.6: an agentic model with a swarm of 300 sub-agents
Chinese lab Moonshot AI has open-sourced Kimi K2.6, a multimodal agentic model that autonomously coordinates up to 300 sub-agents and 4,000

Microsoft Phi-4-Mini: implementing quantization, RAG, and LoRA in a single Jupyter notebook
The tutorial shows the full Phi-4-mini-instruct pipeline: 4-bit quantization, streaming generation, reasoning, tool calling, RAG, and LoRA f

Qwen 3.6-35B-A3B in practice: multimodality, MoE, and RAG in a single pipeline
A detailed breakdown of implementing Qwen 3.6-35B-A3B, from model loading to RAG, tool calling, and session persistence in practical workflo

NVIDIA presented Nemotron 3 Super, an open model with 120 billion parameters
The new hybrid model combines the Mamba and Attention architectures with a Mixture-of-Experts approach, delivering a fivefold increase in performa

Google presented TensorFlow 2.21 and LiteRT for mobile AI
Google presented TensorFlow 2.21 and LiteRT, a new standard for running neural networks on smartphones with NPU support and GPU acceleration for mobi

Google launched Android Bench to evaluate AI in mobile development
Google opened access to Android Bench, the first specialized tool for testing the skills of large language models in the development of

OpenAI presented Codex Security for automatically finding and fixing code vulnerabilities
OpenAI is launching Codex Security, an intelligent agent for code security analysis that not only finds bugs but also suggests

Liquid AI released a system for running AI agents entirely on-device
Liquid AI introduced the LFM2-24B-A2B model and the LocalCowork app, a combination for carrying out complex workflows with AI agents

Yuan 3.0 Ultra: a trillion parameters with record efficiency
China's YuanLab AI introduced an open multimodal model with a trillion parameters that activates only 68.8 billion of them. Reduc

Alibaba released OpenSandbox, a unified environment for the safe operation of AI agents
Alibaba open-sourced OpenSandbox, a tool that gives AI agents isolated sandboxes for code execution, web browsing, and