Together AI Showcases Nine Papers at ICML 2026 in Seoul

Together AI has had nine research papers accepted for presentation at the International Conference on Machine Learning (ICML) 2026. Held July 6–11 in Seoul, South Korea, ICML is among the top global machine learning conferences; this year it received a record 24,371 submissions and accepted only 26.6% of them. Together AI’s contributions span the AI stack, from agent systems to GPU kernel optimization, reflecting the company’s focus on vertically integrated research and production-scale AI.

Key highlights from Together AI’s research include:

1. Cutting-Edge AI Agents

At the forefront of AI agent research, Together AI introduced three notable advancements:

DSGym: A framework for evaluating and training data science agents, featuring over 1,000 tasks across 10+ domains. Notably, DSGym eliminates common benchmarking loopholes by requiring agents to work directly with datasets rather than recalling pre-learned answers.
ThunderAgent: A novel inference system delivering up to 3.6x faster throughput for multi-turn agent workflows by treating workflows as first-class objects.
TTT-Discover: A method that uses open models to produce state-of-the-art discoveries in fields like mathematics, biology, and GPU kernel design, at a fraction of the cost of proprietary systems.

2. Model Shaping and Reasoning

The company’s contributions to model training emphasize improving reasoning capabilities without scaling compute:

RARO: A framework enabling AI models to achieve RL-grade reasoning on tasks without clear answer keys, such as poetry or financial analysis.
V1: A training technique that boosts answer accuracy by 10% without requiring additional compute, leveraging pairwise comparison for better output selection.
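
To make the idea of pairwise comparison for output selection concrete, here is a minimal Python sketch. It is not V1's actual method, which the article does not detail. It only shows the general pattern of picking one answer from several by head-to-head comparison. The function names and the toy judge are assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Sequence


def select_by_pairwise_wins(
    candidates: Sequence[str],
    prefer: Callable[[str, str], bool],
) -> str:
    """Return the candidate that wins the most head-to-head comparisons.

    `prefer(a, b)` is a hypothetical judge: True means `a` beats `b`.
    Ties on win count go to the earliest candidate.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    wins = [0] * len(candidates)
    # Compare every unordered pair exactly once (round-robin).
    for i, j in combinations(range(len(candidates)), 2):
        if prefer(candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    best = max(range(len(candidates)), key=lambda k: wins[k])
    return candidates[best]


# Toy judge (an assumption for illustration): prefer the answer closer to 42.
def closer_to_42(a: str, b: str) -> bool:
    return abs(int(a) - 42) <= abs(int(b) - 42)


answers = ["40", "45", "42", "17"]
chosen = select_by_pairwise_wins(answers, closer_to_42)
print(chosen)
```

In practice the judge would be a model rather than a hand-written rule, and a round-robin needs a number of comparisons that grows quadratically with the number of candidates, so a real system would likely compare fewer pairs.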

3. Efficiency and Optimization

Together AI is also tackling the challenge of optimizing inference and hardware utilization:

Aurora: An adaptive speculative decoding system that improves model inference speeds by 1.25x as traffic patterns evolve, demonstrating its utility in real-time production environments.
Untied Ulysses: A memory-efficient method that enables training with a 5-million-token context on a single GPU node, reducing memory usage by up to 87.5%.
Opportunistic Expert Activation (OEA): A routing method that cuts Mixture-of-Experts (MoE) decode latency by up to 39%, reclaiming efficiency lost during batch inference.
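
For readers unfamiliar with speculative decoding, the technique Aurora builds on, the following Python sketch shows one step of the standard greedy variant: a cheap draft model proposes several tokens and the larger target model keeps the longest run it agrees with. This is not Aurora's implementation, and the adaptive part (updating the draft as traffic shifts) is not shown. The toy models and all names here are assumptions.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # maps a token prefix to the next token


def speculative_step(
    prefix: List[int], draft: NextToken, target: NextToken, k: int
) -> List[int]:
    """One greedy speculative-decoding step; returns the tokens to append.

    The draft proposes k tokens. The target checks them in order, keeps the
    longest agreeing run, then contributes one token of its own: a correction
    on the first mismatch, or a bonus token if all k were accepted.
    """
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    accepted: List[int] = []
    # A real system scores all k positions in one batched target pass;
    # calling the target per position here keeps the sketch readable.
    for tok in proposed:
        expected = target(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # target's correction ends the step
            return accepted
        accepted.append(tok)
    accepted.append(target(prefix + accepted))  # bonus token
    return accepted


# Toy models (assumptions): the target counts upward modulo 10;
# the draft agrees except that it guesses 0 after seeing a 3.
def toy_target(tokens: List[int]) -> int:
    return (tokens[-1] + 1) % 10


def toy_draft(tokens: List[int]) -> int:
    return 0 if tokens[-1] == 3 else (tokens[-1] + 1) % 10


print(speculative_step([1], toy_draft, toy_target, k=4))
print(speculative_step([5], toy_draft, toy_target, k=3))
```

The output always matches what the target would have generated alone under greedy decoding. The speedup comes from how many draft tokens are accepted per target pass, which is why a draft that tracks current traffic matters.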

4. Low-Level Hardware Innovations

At the GPU kernel level, Together AI’s ParallelKernelBench benchmark evaluates multi-GPU kernel generation, highlighting the challenges of scaling AI models across distributed systems. The benchmark offers 87 real workloads, emphasizing the need for efficient multi-GPU communication and computation.

Why This Matters

Together AI’s presence at ICML 2026 underscores its ambition to lead in AI research and production-scale efficiency. By addressing challenges across the full stack, from high-level agent design to hardware-level optimizations, the company is positioning itself as a key player in scalable AI solutions. For context, ICML 2026 has already drawn attention for its use of agentic AI peer reviewers, reflecting the growing role of automation in the research process.

Attendees can learn more about Together AI’s work at booth B714 throughout the conference. The team is also actively recruiting researchers and engineers to further its mission of developing AI-native cloud infrastructure and next-generation machine learning systems.

For more details on the research and to explore career opportunities, visit Together AI.
