AgentCity

A multi-agent framework for continuous construction and evaluation of spatiotemporal benchmarks on top of LibCity


Overview

AgentCity retrieves relevant papers based on user-specified keywords, migrates their implementations into LibCity under unified interfaces, and evaluates the migrated models with automated hyperparameter tuning.

8+ Migrated Models
3 Processing Stages
100% Automated

Key Features

Literature Retrieval

Automatically retrieves and analyzes relevant research papers based on your specified keywords and research topics.

Model Migration

Seamlessly migrates model implementations into LibCity with unified interfaces and consistent coding standards.

Hyperparameter Tuning

Automatically tunes the hyperparameters of each migrated model on the target datasets before evaluation.

Architecture

Global Coordinator

AgentCity is organized around a Global Coordinator, which executes three stages in sequence. Each stage is managed by a Stage Leader Agent that is responsible for planning the workflow, delegating tasks, and monitoring progress.

Concrete operations are carried out by specialized Subagents, each designed for a narrowly defined function (e.g., paper analysis, code adaptation, or evaluation).
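The coordinator, leader, and subagent hierarchy described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the class names `GlobalCoordinator` and `StageLeader`, the dictionary-based context, and the toy lambda subagents are all hypothetical and are not taken from AgentCity's code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical; they are not AgentCity's actual API.
# A subagent takes the working context and returns the keys it adds or updates.
Subagent = Callable[[dict], dict]


@dataclass
class StageLeader:
    name: str
    subagents: Dict[str, Subagent] = field(default_factory=dict)

    def plan(self, context: dict) -> List[str]:
        # Trivial plan: run every registered subagent in registration order.
        return list(self.subagents)

    def run(self, context: dict) -> dict:
        result = dict(context)
        for task in self.plan(context):
            # Delegate one narrowly scoped task and merge its output.
            result.update(self.subagents[task](result))
        return result


@dataclass
class GlobalCoordinator:
    stages: List[StageLeader]

    def run(self, keywords: List[str]) -> dict:
        context = {"keywords": keywords}
        for stage in self.stages:  # stages execute strictly in sequence
            context = stage.run(context)
        return context


# Toy subagents standing in for paper analysis, code adaptation, and evaluation.
coordinator = GlobalCoordinator(stages=[
    StageLeader("literature_retrieval", {
        "paper_analysis": lambda ctx: {
            "papers": [f"paper on {k}" for k in ctx["keywords"]]
        },
    }),
    StageLeader("model_migration", {
        "code_adaptation": lambda ctx: {
            "models": [f"model from {p}" for p in ctx["papers"]]
        },
    }),
    StageLeader("hyperparameter_tuning", {
        "evaluation": lambda ctx: {"evaluated": len(ctx["models"])},
    }),
])

final = coordinator.run(["traffic forecasting"])
print(final["models"])
```

In this sketch each leader owns its subagents, so a new capability is added by registering another subagent with one stage and leaving the coordinator unchanged.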

1. Literature Retrieval

Search and analyze academic papers, then extract relevant methodologies and implementations.

2. Model Migration

Adapt and integrate model implementations into LibCity's unified framework with proper interfaces.

3. Hyperparameter Tuning

Optimize model hyperparameters and evaluate performance across multiple benchmark datasets.

Robustness & Refinement

To ensure robustness under heterogeneous inputs, each stage supports a bounded refinement loop. When a stage produces an unsatisfactory outcome (such as low-quality search results, migration errors, or unstable evaluation performance), the Stage Leader Agent analyzes the feedback and selectively re-invokes the relevant Subagents to refine the result.

These refinement loops are explicitly bounded by a maximum number of iterations, ensuring predictable execution time and preventing uncontrolled retries.
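A bounded refinement loop of this kind might look roughly like the sketch below. The function name `run_with_refinement`, the default limit of three iterations, and the toy migration and check functions are assumptions made for illustration; the text does not state the actual limit or the acceptance criteria.

```python
from typing import Callable, List, Tuple


# Hypothetical helper; not part of AgentCity's published interface.
def run_with_refinement(
    attempt: Callable[[str], str],
    evaluate: Callable[[str], Tuple[bool, str]],
    max_iterations: int = 3,
) -> Tuple[str, bool, int]:
    """Call `attempt` until `evaluate` accepts the result or the budget is spent.

    Returns (last_result, accepted, iterations_used).
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback = ""
    result = ""
    for iteration in range(1, max_iterations + 1):
        result = attempt(feedback)              # re-invoke the subagent
        accepted, feedback = evaluate(result)   # leader inspects the outcome
        if accepted:
            return result, True, iteration
    # Budget exhausted: return the last result and flag it as not accepted.
    return result, False, max_iterations


# Toy subagent that only succeeds on its third call.
calls: List[str] = []


def toy_migration(feedback: str) -> str:
    calls.append(feedback)
    return "ok" if len(calls) >= 3 else "import error"


def toy_check(result: str) -> Tuple[bool, str]:
    ok = result == "ok"
    return ok, "" if ok else f"fix: {result}"


outcome = run_with_refinement(toy_migration, toy_check, max_iterations=5)
print(outcome)  # ('ok', True, 3)
```

Because the loop is a plain `for` over a fixed range, the worst-case number of subagent calls per stage is known in advance, which is the property the hard limit is meant to guarantee.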

Intermediate results from each stage are summarized into structured notes and passed to downstream stages by the Global Coordinator. This design allows later stages to reuse upstream decisions and constraints without reprocessing the same information, while keeping stage responsibilities clearly separated.
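One way to picture the structured notes is as small immutable records that the coordinator collects and exposes to later stages. The sketch below is an assumption about what such notes could contain: the `StageNote` fields, the `NoteBoard` class, and the example values are hypothetical, since the text does not describe the actual note schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


# Hypothetical schema; the real note format is not specified in the text.
@dataclass(frozen=True)
class StageNote:
    stage: str
    summary: str
    decisions: Dict[str, str] = field(default_factory=dict)
    constraints: List[str] = field(default_factory=list)


class NoteBoard:
    """Held by the coordinator; stages append notes and read upstream ones."""

    def __init__(self) -> None:
        self._notes: List[StageNote] = []

    def add(self, note: StageNote) -> None:
        self._notes.append(note)

    def upstream(self) -> List[StageNote]:
        # Return a copy so a stage cannot reorder or drop earlier notes.
        return list(self._notes)

    def to_json(self) -> str:
        return json.dumps([asdict(n) for n in self._notes], indent=2)


board = NoteBoard()
board.add(StageNote(
    stage="literature_retrieval",
    summary="Selected one candidate paper with public code.",
    decisions={"selected_paper": "example-paper"},
    constraints=["requires an adjacency matrix"],
))

# The migration stage reuses upstream constraints instead of re-reading the paper.
inherited = [c for note in board.upstream() for c in note.constraints]
board.add(StageNote(
    stage="model_migration",
    summary="Adapted the model to the target interface.",
    constraints=inherited,
))

print(board.to_json())
```

Keeping notes append-only and read-only for downstream stages mirrors the separation of responsibilities described above: a later stage can rely on earlier decisions but cannot rewrite them.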