2026-05-16

15 items

Company news OpenAI 2026-05-16

OpenAI and Malta partner to bring ChatGPT Plus to all citizens

OpenAI and Malta partner to expand AI access, offering ChatGPT Plus and training to help citizens build practical AI skills and use AI responsibly....

Company news Microsoft Research 2026-05-15

Further Notes on Our Recent Research on AI Delegation and Long-Horizon Reliability

Our recent paper, “LLMs Corrupt Your Documents When You Delegate”, has generated discussion about the reliability of AI systems in delegated workflows. We appreciate the interest in this work and want to clarify several important points about what the paper does—and does not—claim. The research aims...

#llm#benchmark

Paper arXiv 2026-05-14

ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both

Visual reasoning, often interleaved with intermediate visual states, has emerged as a promising direction in the field. A straightforward approach is to directly generate images via unified models during reasoning, but this is computationally expensive and architecturally non-trivial. Recent alterna...

#agent#rl#benchmark

Paper arXiv 2026-05-14

Evidential Reasoning Advances Interpretable Real-World Disease Screening

Disease screening is critical for early detection and timely intervention in clinical practice. However, most current screening models for medical images suffer from limited interpretability and suboptimal performance. They often lack effective mechanisms to reference historical cases or provide tra...

#benchmark

Paper arXiv 2026-05-14

Text Knows What, Tables Know When: Clinical Timeline Reconstruction via Retrieval-Augmented Multimodal Alignment

Reconstructing precise clinical timelines is essential for modeling patient trajectories and forecasting risk in complex, heterogeneous conditions like sepsis. While unstructured clinical narratives offer semantically rich and contextually complete descriptions of a patient's course, they often lack...

#multimodal#rag#alignment#llm#benchmark

Paper arXiv 2026-05-14

MeMo: Memory as a Model

Large language models (LLMs) achieve strong performance across a wide range of tasks, but remain frozen after pretraining until subsequent updates. Many real-world applications require timely, domain-specific information, motivating the need for efficient mechanisms to incorporate new knowledge. In ...

#llm#benchmark

Paper arXiv 2026-05-14

Self-Distilled Agentic Reinforcement Learning

Reinforcement learning (RL) has emerged as a central paradigm for post-training LLM agents, yet its trajectory-level reward signal provides only coarse supervision for long-horizon interaction. On-Policy Self-Distillation (OPSD) complements RL by introducing dense token-level guidance from a teacher...

#agent#rl#llm#benchmark

Paper arXiv 2026-05-14

APWA: A Distributed Architecture for Parallelizable Agentic Workflows

Autonomous multi-agent systems based on large language models (LLMs) have demonstrated remarkable abilities in independently solving complex tasks in a wide breadth of application domains. However, these systems hit critical reasoning, coordination, and computational scaling bottlenecks as the size ...

#agent#llm#benchmark

Paper arXiv 2026-05-14

Understanding How International Students in the U.S. Are Using Conversational AI to Support Cross-Cultural Adaptation

Moving to a new culture and adapting to a new life, as an international student, can be a stressful experience. In the US, international students face unique overlapping challenges, yet the current support ecosystem, including university support systems and informal social networks, remains largely ...

Paper arXiv 2026-05-14

Concurrency without Model Changes: Future-based Asynchronous Function Calling for LLMs

Function calling, also known as tool use, is a core capability of modern LLM agents but is typically constrained by synchronous execution semantics. Under these semantics, LLM decoding is blocked until each function call completes, resulting in increasing end-to-end latency. In this work, we introdu...

#llm#coding#benchmark#agent#fine-tuning

Company news OpenAI 2026-05-15

A new personal finance experience in ChatGPT

Preview a new personal finance experience in ChatGPT for Pro users in the U.S. Securely connect your financial accounts and get AI-powered insights and guidance grounded in your financial context, goals, and priorities....

Model OpenAI 2026-05-15

Databricks brings GPT-5.5 to enterprise agent workflows

Databricks uses GPT-5.5 for enterprise agent workflows after the model set a new state of the art on the OfficeQA Pro benchmark....

#agent#benchmark

Company news OpenAI 2026-05-15

How sales teams use Codex

How business operations teams use Codex

How data science teams use Codex