Skip to content

AI-Ready Scoring Reference

The AI-Ready score evaluates how well an RO-Crate meets criteria for responsible, reproducible AI research. It is computed across 7 categories with 28 sub-criteria.

Each sub-criterion is scored as either met (property present with content) or not met (missing/empty). Some criteria are inferred from the RO-Crate structure itself and are always met.

For release-level RO-Crates, aggregated metrics (evi:* properties) are used when available, falling back to counting entities in the metadata graph.


0. Fairness (FAIR Principles)

Sub-criterion Properties Checked Logic
Findable identifier (DOI), @id Direct -- checks DOI first, falls back to @id
Accessible -- Inferred (always met: RO-Crate is machine-readable)
Interoperable -- Inferred (always met: schema.org + RO-Crate standard)
Reusable license Direct

1. Provenance

Sub-criterion Properties Checked Logic
Transparent Count of Dataset entities (or evi:datasetCount) Counted
Traceable Count of Computation + Experiment entities (or evi:computationCount) Counted
Interpretable Count of Software entities (or evi:softwareCount) Counted
Key Actors Identified author, publisher, principalInvestigator Direct -- any combination triggers

2. Characterization

Sub-criterion Properties Checked Logic
Semantics -- Inferred (always met: RO-Crate provides semantic context)
Statistics contentSize, hasSummaryStatistics (or evi:totalContentSizeBytes, evi:entitiesWithSummaryStats) Counted -- checks total size and summary stats availability
Standards Count of Schema entities (or evi:schemaCount) Counted
Potential Sources of Bias rai:dataBiases Direct
Data Quality rai:dataCollectionMissingData Direct

3. Pre-Model Explainability

Sub-criterion Properties Checked Logic
Data Documentation Template dct:conformsTo Inferred (always met: RO-Crate + Croissant RAI 1.0). Optional: set dct:conformsTo: "http://mlcommons.org/croissant/RAI/1.0" on the root entity to declare explicit conformance
Fit for Purpose rai:dataUseCases, rai:dataLimitations Direct -- either triggers
Verifiable md5, MD5, sha256, SHA256, hash on Dataset/Software entities (or evi:totalEntities, evi:entitiesWithChecksums) Counted -- reports percentage of files with checksums

4. Ethics

Sub-criterion Properties Checked Logic
Ethically Acquired rai:dataCollection, humanSubjects (fallback: additionalProperty "Human Subject") Direct -- either triggers
Ethically Managed ethicalReview, dataGovernanceCommittee (fallback: additionalProperty "Data Governance Committee") Direct -- either triggers
Ethically Disseminated license, rai:personalSensitiveInformation, prohibitedUses (fallback: additionalProperty "Prohibited Uses") Direct -- any combination triggers
Secure confidentialityLevel Direct

5. Sustainability

Sub-criterion Properties Checked Logic
Persistent identifier (DOI), @id Direct -- checks DOI first, falls back to @id
Domain Appropriate rai:dataReleaseMaintenancePlan Direct
Well Governed dataGovernanceCommittee (fallback: additionalProperty "Data Governance Committee") Direct
Associated -- Inferred (always met: RO-Crate links all components)

6. Computability

Sub-criterion Properties Checked Logic
Standardized format values across Dataset/Software entities (or evi:formats) Counted -- lists up to 5 unique formats
Computationally Accessible publisher Direct
Portable -- Inferred (always met: RO-Crate standard)
Contextualized -- Inferred (always met: RO-Crate graph structure)

Cross-Reference: Properties That Affect Both Datasheet and Scoring

These properties appear in the datasheet and influence the AI-Ready score. Filling them in gives you the most value.

RO-Crate Property Datasheet Section(s) AI-Ready Category
@id Overview Fairness (findable), Sustainability (persistent)
identifier (DOI) Overview Fairness (findable), Sustainability (persistent)
license Overview, Distribution Fairness (reusable), Ethics (ethically disseminated)
author Overview Provenance (key actors)
publisher Overview, Distribution Provenance (key actors), Computability (computationally accessible)
principalInvestigator Overview Provenance (key actors)
contentSize Overview Characterization (statistics)
confidentialityLevel Overview Ethics (secure)
ethicalReview Overview Ethics (ethically managed)
dataGovernanceCommittee Overview Ethics (ethically managed), Sustainability (well governed)
humanSubjects Overview Ethics (ethically acquired)
rai:dataBiases Use Cases Characterization (bias)
rai:dataCollectionMissingData Use Cases Characterization (data quality)
rai:dataUseCases Use Cases Pre-Model Explainability (fit for purpose)
rai:dataLimitations Use Cases Pre-Model Explainability (fit for purpose)
rai:dataCollection Use Cases Ethics (ethically acquired)
rai:personalSensitiveInformation Use Cases Ethics (ethically disseminated)
prohibitedUses Use Cases Ethics (ethically disseminated)
rai:dataReleaseMaintenancePlan Use Cases Sustainability (domain appropriate)

Properties That Only Affect AI-Ready Score

These properties are not shown on the datasheet but do influence scoring:

Property AI-Ready Category Notes
hasSummaryStatistics Characterization (statistics) Reference to a summary stats entity
md5 / sha256 / hash Pre-Model Explainability (verifiable) On individual Dataset/Software entities
format Computability (standardized) On individual Dataset/Software entities
Entity counts (Dataset, Software, Computation, Experiment, Schema) Provenance, Characterization Counted from metadata graph or evi:* aggregates

Aggregated Release Metrics (evi:* properties)

For release-level RO-Crates, these pre-aggregated properties are checked first:

Property Replaces
evi:datasetCount Counting Dataset entities
evi:computationCount Counting Computation/Experiment entities
evi:softwareCount Counting Software entities
evi:schemaCount Counting Schema entities
evi:totalContentSizeBytes Summing contentSize across entities
evi:entitiesWithSummaryStats Counting entities with hasSummaryStatistics
evi:totalEntities Counting Dataset/Software entities for checksum percentage
evi:entitiesWithChecksums Counting entities with md5/sha256/hash
evi:formats Collecting unique format values