Search results for "AGENT"
Today
08:20

Alibaba Cloud Launches JVS Crew, Enterprise-Grade AI Agent Platform

Gate News, April 23: Alibaba Cloud officially released JVS Crew, an enterprise-grade platform for building AI agents, designed with an "integration-first" approach. The platform lets enterprises quickly embed AI agent capabilities into existing apps, SaaS services, or smart hardware
04:54

Perplexity Discloses Web Search Agent Post-Training Method; Qwen3.5-Based Model Outperforms GPT-5.4 on Accuracy and Cost

Perplexity applies SFT followed by RL to Qwen3.5 models, using a multi-hop QA dataset and rubric checks to improve search accuracy and efficiency and achieving best-in-class FRAMES performance. Abstract: Perplexity's post-training workflow for web-search agents combines supervised fine-tuning (SFT), which enforces instruction-following and language consistency, with online reinforcement learning (RL) using the GRPO algorithm. The RL stage uses a proprietary multi-hop verifiable QA dataset, plus rubric-based conversational data to prevent drift from the SFT behavior, with reward gating and within-group efficiency penalties. In evaluation, Qwen3.5-397B-SFT-RL achieves top FRAMES performance: 57.3% accuracy with a single tool call and 73.9% with four calls at $0.02 per query, outperforming GPT-5.4 and Claude Sonnet 4.6 on these metrics. Cost figures are based on API pricing and exclude caching.
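The reward shaping described above can be illustrated with a minimal sketch. It is not Perplexity's implementation: the function names, the penalty weight, and the exact gating rule (rubric score counts only when the answer is verified correct; extra tool calls are penalized relative to the cheapest correct rollout in the same group) are assumptions made for illustration. Only the group-relative normalization of rewards is the standard GRPO idea.

```python
from statistics import mean, pstdev


def gated_rewards(correct, rubric, tool_calls, penalty=0.05):
    """Hypothetical reward shaping for one group of rollouts on the same query.

    correct    -- list of bools: did the rollout's answer verify as correct?
    rubric     -- list of floats in [0, 1]: rubric score for each rollout
    tool_calls -- list of ints: number of tool calls each rollout used
    penalty    -- assumed weight per tool call beyond the cheapest correct rollout
    """
    correct_calls = [n for ok, n in zip(correct, tool_calls) if ok]
    fewest = min(correct_calls) if correct_calls else 0

    rewards = []
    for ok, score, n in zip(correct, rubric, tool_calls):
        if not ok:
            # Gating (assumed rule): an incorrect answer earns nothing,
            # regardless of its rubric score.
            rewards.append(0.0)
            continue
        # Within-group efficiency penalty (assumed form): correct rollouts
        # lose reward for each tool call beyond the group's cheapest correct one.
        rewards.append(1.0 + score - penalty * (n - fewest))
    return rewards


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: normalize rewards within the group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    correct = [True, True, False, True]
    rubric = [0.5, 0.5, 0.9, 0.0]
    tool_calls = [1, 4, 2, 1]
    rewards = gated_rewards(correct, rubric, tool_calls)
    print(rewards)
    print(group_advantages(rewards))
```

In this toy group, the incorrect rollout receives zero reward despite its high rubric score, and of two otherwise identical correct rollouts the one using four tool calls ends up with a lower advantage than the one using a single call.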
03:49

OpenAI Codex Team Fixes OpenClaw Authentication Bug, Significantly Improves Agent Behavior

OpenClaw switches from the Pi harness to the Codex harness to fix a silent authentication fallback, with two PRs addressing the authentication bridge and the fallback; post-fix, the agent moves from shallow heartbeat polling to a full work loop. Abstract: OpenClaw’s Codex harness work addressed a critical authentication flaw that caused a silent fallback to the Pi harness when using Codex with OpenAI models. Two pull requests fix the authentication bridge and prevent the silent fallback, changing the runtime adapter. As a result, agent behavior shifts from shallow heartbeat polling to a full work loop that reads context, analyzes tasks, edits repositories, and verifies progress, improving continuity and visibility across heartbeats.