DRAFT: Record per episode metadata and results#806

Closed
alexmillane wants to merge 25 commits into alex/feature/evaluation_visualization from alex/feature/record_per_episode_metadata_and_results

@alexmillane

Collaborator

Summary

Record per-episode data.

Detailed description

  • Adds a per-episode data recorder to our env that accumulates per-episode data and optionally writes it to disk.
  • Getting at the per-environment success flag requires overriding an internal method of IsaacLabManagerBasedRLEnv.
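
As a rough illustration of the recorder described in the bullets above, the sketch below accumulates one record per finished episode and optionally writes the records to disk as JSON Lines. The class name `EpisodeResultsRecorder`, the `record()` signature, and the field names are assumptions made for illustration, not the PR's actual implementation. The override of the IsaacLabManagerBasedRLEnv internal method is not shown.

```python
import json
import os
import tempfile


class EpisodeResultsRecorder:
    """Hypothetical sketch: accumulates one record per finished episode."""

    def __init__(self):
        self._records = []

    def record(self, episode_idx, env_idx, success, metadata=None):
        # Store plain Python types so every record is JSON-serializable.
        self._records.append({
            "episode_idx": int(episode_idx),
            "env_idx": int(env_idx),
            "success": bool(success),
            "metadata": dict(metadata or {}),
        })

    def write(self, path):
        # One JSON object per line (JSONL); returns the number of records written.
        with open(path, "w", encoding="utf-8") as f:
            for rec in self._records:
                f.write(json.dumps(rec) + "\n")
        return len(self._records)


# Usage: record two episodes, then write them to a temporary file.
recorder = EpisodeResultsRecorder()
recorder.record(episode_idx=0, env_idx=0, success=True, metadata={"seed": 1})
recorder.record(episode_idx=1, env_idx=1, success=False)
out_path = os.path.join(tempfile.mkdtemp(), "episode_results.jsonl")
num_written = recorder.write(out_path)
```

Keeping accumulation separate from writing means the caller decides whether and where the data lands on disk.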

Record every value drawn by an enabled variation's sampler so
downstream sensitivity-analysis tooling has the input factors that
produced each episode.

This adds a sample-observer layer the recorder builds on: SamplerBase
gains add_listener/remove_listener and a sample() template method that
notifies listeners around the concrete _sample(); VariationBase gains
add_sample_listener/remove_sample_listener, re-binding subscriptions onto
the sampler rebuilt by apply_cfg so they survive cfg swaps.
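
The observer layer described above might look roughly like the following sketch. The method names `add_listener`, `remove_listener`, `sample`, `_sample`, `add_sample_listener`, `remove_sample_listener`, and `apply_cfg` come from the commit message. Everything else is an assumption: the `UniformSampler` subclass, the constructor arguments, the `apply_cfg(low, high)` signature, and the choice to notify listeners once after the draw with the sampled value.

```python
import random


class SamplerBase:
    def __init__(self):
        self._listeners = []

    def add_listener(self, fn):
        self._listeners.append(fn)

    def remove_listener(self, fn):
        self._listeners.remove(fn)

    def sample(self):
        # Template method: subclasses implement _sample(); listeners see each draw.
        value = self._sample()
        for fn in list(self._listeners):
            fn(value)
        return value

    def _sample(self):
        raise NotImplementedError


class UniformSampler(SamplerBase):  # hypothetical concrete sampler
    def __init__(self, low, high, seed=0):
        super().__init__()
        self._low, self._high = low, high
        self._rng = random.Random(seed)

    def _sample(self):
        return self._rng.uniform(self._low, self._high)


class VariationBase:
    def __init__(self, low, high):
        self._sample_listeners = []
        self.sampler = UniformSampler(low, high)

    def add_sample_listener(self, fn):
        self._sample_listeners.append(fn)
        self.sampler.add_listener(fn)

    def remove_sample_listener(self, fn):
        self._sample_listeners.remove(fn)
        self.sampler.remove_listener(fn)

    def apply_cfg(self, low, high):
        # The sampler is rebuilt, so re-bind every subscription onto the new one.
        self.sampler = UniformSampler(low, high)
        for fn in self._sample_listeners:
            self.sampler.add_listener(fn)


# Usage: the listener keeps receiving values after a cfg swap.
recorded = []


def on_sample(value):
    recorded.append(value)


variation = VariationBase(0.0, 1.0)
variation.add_sample_listener(on_sample)
first = variation.sampler.sample()
variation.apply_cfg(10.0, 20.0)
second = variation.sampler.sample()
```

Because the variation keeps its own list of subscriptions, callers do not need to know that `apply_cfg` replaced the sampler underneath them.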

ArenaEnvBuilder constructs a VariationRecorder, attaches it after Hydra
overrides but before any sampling, and exposes it on env.unwrapped. The
recorder is stashed on the builder rather than the env cfg because the
configclass __post_init__ deep-copies its attributes, which would orphan
the listener closures.
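
One plausible reading of the deep-copy problem is sketched below. A listener closure keeps a reference to the recorder it was created from, so if a config object stores a deep copy of that recorder, the copy never receives samples. `VariationRecorder` here is a minimal stand-in, and `EnvCfg` is a hypothetical class that imitates a configclass deep-copying its attributes. Neither reproduces the real code.

```python
import copy


class VariationRecorder:  # minimal stand-in, not the PR's class
    def __init__(self):
        self.samples = []

    def make_listener(self, name):
        # The closure captures this specific recorder instance.
        def listener(value):
            self.samples.append((name, value))
        return listener


class EnvCfg:  # hypothetical: imitates a config that deep-copies its attributes
    def __init__(self, recorder):
        self.recorder = copy.deepcopy(recorder)


original = VariationRecorder()
listener = original.make_listener("mass")
cfg = EnvCfg(original)

# The listener still writes to the original; the cfg's copy stays empty.
listener(1.5)
```

After this runs, `original.samples` holds the recorded value while `cfg.recorder.samples` is empty, which is the kind of orphaning that keeping the recorder on the builder avoids.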

Signed-off-by: alex <amillane@nvidia.com>
…isode data into visualizer, make camera obs recorder use centralized episode idx.

@aiguldzh-nvidia aiguldzh-nvidia left a comment

Collaborator

The per-episode recording contains almost all of the data we need. The only thing missing is some job-level metadata, but I don't think that's a blocker. The current approach looks good to me.

# Request the per-episode results write into this job's output subdir
# (the same directory the video recorders use), one file per rebuild.
results_path = os.path.join(video_cfg.video_base_dir, f"episode_results_rebuild{rebuild_idx}.jsonl")
env.unwrapped.episode_results_recorder.write(results_path)

Collaborator

write() is inside the try block, so if it raises (e.g. a permissions error), the except clause catches it and marks the job as failed, even though the rollout itself completed successfully.
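
One way to address this, sketched under assumptions, is to give the results write its own error handling so that a write failure is reported without changing the job status. The `run_job` wrapper, its callable arguments, and the returned dictionary keys are hypothetical and do not reflect the actual job runner.

```python
def run_job(rollout, write_results):
    """Hypothetical job wrapper: only rollout failures mark the job as failed."""
    try:
        rollout()
    except Exception as exc:
        return {"status": "failed", "error": repr(exc)}

    # The write happens outside the rollout's try block.
    try:
        write_results()
    except OSError as exc:
        # The rollout succeeded; surface the write problem separately.
        return {"status": "completed", "results_write_error": repr(exc)}
    return {"status": "completed", "results_write_error": None}


def ok():
    pass


def bad_write():
    raise PermissionError("denied")


def bad_rollout():
    raise RuntimeError("sim crashed")


write_failure = run_job(ok, bad_write)
rollout_failure = run_job(bad_rollout, ok)
clean_run = run_job(ok, ok)
```

Here a `PermissionError` from the write leaves the status as completed and records the error alongside it, while a rollout exception still marks the job as failed.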

@alexmillane

Collaborator Author

Superseded by #810
