Getting Started
RO-Crate
To perform any RO-Crate operation, use the rocrate sub-command of the fairscape-cli root command.
Create RO-Crate
To create an RO-Crate, use either the create or the init sub-command. With create, you specify the destination directory through the ROCRATE_PATH argument, whereas init creates the RO-Crate in the current working directory. Both sub-commands require five options (name, description, keywords, organization-name, and project-name) and accept an optional guid. To list all available options and arguments, run fairscape-cli rocrate create --help:
Usage: fairscape-cli rocrate create [OPTIONS] ROCRATE_PATH
Create an ROCrate in a new path specified by the rocrate-path argument
Options:
--guid TEXT
--name TEXT [required]
--organization-name TEXT [required]
--project-name TEXT [required]
--description TEXT [required]
--keywords TEXT [required]
--help Show this message and exit.
To create an RO-Crate with minimal metadata, use the following command. It generates a unique identifier and creates a ro-crate-metadata.json file at the specified ROCRATE_PATH location.
fairscape-cli rocrate create \
--name "test rocrate" \
--description "Example RO Crate for Tests" \
--organization-name "UVA" \
--project-name "B2AI" \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
"./test_rocrate"
Alternatively, use the fairscape-cli rocrate init command to create the same RO-Crate in the current working directory.
fairscape-cli rocrate init \
--name "test rocrate" \
--description "Example RO Crate for Tests" \
--organization-name "UVA" \
--project-name "B2AI" \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS"
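As a quick sanity check after either command, a script can confirm that the metadata file was written. The sketch below relies only on the ro-crate-metadata.json file name mentioned above; has_crate_metadata is a hypothetical helper name and is not part of fairscape-cli.

```shell
#!/bin/sh
# Hypothetical helper, not part of fairscape-cli: succeeds only when the
# given directory contains a ro-crate-metadata.json file.
has_crate_metadata() {
    crate_dir=$1
    [ -f "$crate_dir/ro-crate-metadata.json" ]
}

# Example usage after running the create command shown above.
if has_crate_metadata "./test_rocrate"; then
    echo "crate metadata found"
else
    echo "no ro-crate-metadata.json in ./test_rocrate"
fi
```

For a crate made with init, the same check would be pointed at the current working directory (".").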
Add object and metadata
In the FAIRSCAPE ecosystem, datasets and software are treated as objects that can be added to an RO-Crate using the add sub-command. This command fetches the object and transfers it to the crate. Run fairscape-cli rocrate add --help to list the object types that can be added:
Usage: fairscape-cli rocrate add [OPTIONS] COMMAND [ARGS]...
Add (transfer) object to RO-Crate and register object metadata.
Options:
--help Show this message and exit.
Commands:
dataset Add a Dataset file and its metadata to the RO-Crate.
software Add a Software and its corresponding metadata.
Dataset object
The add dataset sub-command adds a dataset object to the crate and populates the corresponding metadata in the ro-crate-metadata.json file. An identifier is generated to uniquely represent the dataset. The sub-command requires nine options: name, author, version, date-published, description, keywords, data-format, source-filepath, and destination-filepath. The remaining options are optional. The dataset metadata is added to ro-crate-metadata.json, and the dataset object is transferred to the specified location in ROCRATE_PATH. Run fairscape-cli rocrate add dataset --help to show its usage:
Usage: fairscape-cli rocrate add dataset [OPTIONS] ROCRATE_PATH
Add a Dataset file and its metadata to the RO-Crate.
Options:
--guid TEXT
--name TEXT [required]
--url TEXT
--author TEXT [required]
--version TEXT [required]
--date-published TEXT [required]
--description TEXT [required]
--keywords TEXT [required]
--data-format TEXT [required]
--source-filepath TEXT [required]
--destination-filepath TEXT [required]
--used-by TEXT
--derived-from TEXT
--schema TEXT
--associated-publication TEXT
--additional-documentation TEXT
--help Show this message and exit.
The example below uses only the required options to add a dataset object to the crate and populate the corresponding metadata in the ro-crate-metadata.json file.
fairscape-cli rocrate add dataset \
--name "AP-MS embeddings" \
--author "Krogan lab (https://kroganlab.ucsf.edu/krogan-lab)" \
--version "1.0" \
--date-published "2021-04-23" \
--description "Affinity purification mass spectrometer (APMS) embeddings for each protein in the study, generated by node2vec predict." \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
--data-format "CSV" \
--source-filepath "./tests/data/APMS_embedding_MUSIC.csv" \
--destination-filepath "./test_rocrate/APMS_embedding_MUSIC.csv" \
"./test_rocrate"
The example below performs the same operation utilizing both required and optional parameters:
fairscape-cli rocrate add dataset \
--guid "ark:5982/UVA/B2AI/example_rocrate/AP-MS_embeddings-Dataset" \
--name "AP-MS embeddings" \
--url "https://github.com/idekerlab/MuSIC/blob/master/Examples/APMS_embedding.MuSIC.csv" \
--author "Krogan lab (https://kroganlab.ucsf.edu/krogan-lab)" \
--version "1.0" \
--date-published "2021-04-23" \
--description "Affinity purification mass spectrometer (APMS) embeddings for each protein in the study, generated by node2vec predict." \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
--data-format "CSV" \
--source-filepath "./tests/data/APMS_embedding_MUSIC.csv" \
--destination-filepath "./test_rocrate/APMS_embedding_MUSIC.csv" \
--used-by "create labeled training & test sets random_forest_samples.py" \
--derived-from "node2vec predict" \
--associated-publication "Qin, Y. et al. A multi-scale map of cell structure fusing protein images and interactions" \
--additional-documentation "https://idekerlab.ucsd.edu/music/" \
"./test_rocrate"
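Because add transfers the file into the crate, a script may want to confirm the result afterwards. The following is a minimal sketch of such a check. verify_transfer is a hypothetical helper, not a fairscape-cli feature. The text above does not say whether the transfer copies or moves the file, so the content comparison runs only when the source file is still present.

```shell
#!/bin/sh
# Hypothetical post-add check, not part of fairscape-cli.
# Succeeds when the destination file exists and, if the source is still
# present, has identical contents (compared byte by byte with cmp).
verify_transfer() {
    src=$1
    dest=$2
    if [ ! -f "$dest" ]; then
        echo "destination file missing: $dest" >&2
        return 1
    fi
    if [ -f "$src" ] && ! cmp -s "$src" "$dest"; then
        echo "destination differs from source: $dest" >&2
        return 1
    fi
    return 0
}

# Example usage with the paths from the commands above.
if verify_transfer "./tests/data/APMS_embedding_MUSIC.csv" \
                   "./test_rocrate/APMS_embedding_MUSIC.csv"; then
    echo "dataset file is in the crate"
fi
```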
One of the features offered by fairscape-cli is the ability to annotate certain types of dataset objects with schema-level metadata. The examples in Schema Metadata demonstrate how to describe the schema of a dataset object as metadata. This feature includes a mechanism to validate the metadata against the object.
Software object
To add a software object, use the add software sub-command, which requires nine options: name, author, version, description, keywords, file-format, source-filepath, destination-filepath, and date-modified. Five additional options are optional. Metadata about the software is added to the ro-crate-metadata.json file, and the software object is transferred to the specified location in ROCRATE_PATH. Run fairscape-cli rocrate add software --help to show its usage:
Usage: fairscape-cli rocrate add software [OPTIONS] ROCRATE_PATH
Add a Software and its corresponding metadata.
Options:
--guid TEXT
--name TEXT [required]
--author TEXT [required]
--version TEXT [required]
--description TEXT [required]
--keywords TEXT [required]
--file-format TEXT [required]
--url TEXT
--source-filepath TEXT [required]
--destination-filepath TEXT [required]
--date-modified TEXT [required]
--used-by-computation TEXT
--associated-publication TEXT
--additional-documentation TEXT
--help Show this message and exit.
The example below uses the required options to add a software object to the crate and populate the associated metadata in ro-crate-metadata.json. An identifier is generated automatically to uniquely represent the software.
fairscape-cli rocrate add software \
--name "calibrate pairwise distance" \
--author "Qin, Y." \
--version "1.0" \
--description "script written in python to calibrate pairwise distance." \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
--file-format "py" \
--source-filepath "./tests/data/calibrate_pairwise_distance.py" \
--destination-filepath "./test_rocrate/calibrate_pairwise_distance.py" \
--date-modified "2021-04-23" \
"./test_rocrate"
The same operation can be performed using both required and optional parameters with the following command.
fairscape-cli rocrate add software \
--guid "ark:5982/UVA/B2AI/example_rocrate/calibrate_pairwise_distance-Software" \
--name "calibrate pairwise distance" \
--author "Qin, Y." \
--version "1.0" \
--description "script written in python to calibrate pairwise distance." \
--keywords "b2ai" \
--keywords "U2OS" \
--file-format "py" \
--url "https://github.com/idekerlab/MuSIC/blob/master/calibrate_pairwise_distance.py" \
--source-filepath "./tests/data/calibrate_pairwise_distance.py" \
--destination-filepath "./test_rocrate/calibrate_pairwise_distance.py" \
--date-modified "2021-06-20" \
--used-by-computation "ARK:compute_standard_proximities.1/f9aa5f3f-665a-4ab9-8879-8d0d52f05265" \
--associated-publication "Qin, Y. et al. A multi-scale map of cell structure fusing protein images and interactions. Nature 600, 536–542 2021" \
--additional-documentation "https://idekerlab.ucsd.edu/music/" \
"./test_rocrate"
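The examples above pass dates such as 2021-04-23. The help output types date-modified only as TEXT, so whether the CLI enforces a particular format is not stated here. As a sketch, a wrapper script could check the shape of a date before invoking the command. is_iso_shaped_date is a hypothetical helper; it tests only the YYYY-MM-DD pattern, not calendar validity.

```shell
#!/bin/sh
# Hypothetical helper, not part of fairscape-cli: succeeds when the argument
# has the shape YYYY-MM-DD (digits and dashes only; no calendar check).
is_iso_shaped_date() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage: report which candidate values match the shape.
for d in "2021-06-20" "June 20, 2021"; do
    if is_iso_shaped_date "$d"; then
        echo "ok: $d"
    else
        echo "unexpected date shape: $d"
    fi
done
```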
Register metadata
Registering metadata adds the metadata of an object (dataset, software) or an activity (computation) to ro-crate-metadata.json. Before the register sub-command runs, the object must already be present at the path given by the --filepath option; no transfer of objects takes place during execution. Registering a computation as an activity has no such path requirement.
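The difference between add and register can be sketched as a small decision helper. It assumes that "already present" means the file sits inside the crate directory, as in the register examples on this page. suggest_subcommand is a hypothetical name and is not part of fairscape-cli.

```shell
#!/bin/sh
# Hypothetical helper, not part of fairscape-cli: suggests which sub-command
# fits a file, based on whether it already sits inside the crate directory.
suggest_subcommand() {
    file=$1
    crate=${2%/}   # drop one trailing slash so the pattern below matches
    if [ ! -f "$file" ]; then
        echo "missing"
        return 1
    fi
    case "$file" in
        "$crate"/*) echo "register" ;;  # already in place, no transfer needed
        *)          echo "add" ;;       # must be transferred into the crate
    esac
}

# Example usage; prints "missing" if the file does not exist yet.
suggest_subcommand "./test_rocrate/APMS_embedding_MUSIC.csv" "./test_rocrate" || true
```

The check is purely textual: it compares path prefixes and does not resolve symbolic links or relative segments such as "..".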
Run fairscape-cli rocrate register --help to show its usage:
Usage: fairscape-cli rocrate register [OPTIONS] COMMAND [ARGS]...
Add a metadata record to the RO-Crate for a Dataset, Software, or
Computation
Options:
--help Show this message and exit.
Commands:
computation Register a Computation with the specified RO-Crate
dataset Register Dataset object metadata with the specified RO-Crate
software Register a Software metadata record to the specified ROCrate
Computation metadata
To register a computation, use the register computation sub-command. In the FAIRSCAPE ecosystem, a computation is considered an activity, unlike datasets and software, which are treated as objects. This sub-command requires five options (name, run-by, date-created, description, and keywords) and accepts five optional ones. Once executed, metadata about the computation is added to ro-crate-metadata.json in the ROCRATE_PATH location.
To view all available options and arguments for registering a computation, run fairscape-cli rocrate register computation --help:
Usage: fairscape-cli rocrate register computation [OPTIONS] ROCRATE_PATH
Register a Computation with the specified RO-Crate
Options:
--guid TEXT
--name TEXT [required]
--run-by TEXT [required]
--command TEXT
--date-created TEXT [required]
--description TEXT [required]
--keywords TEXT [required]
--used-software TEXT
--used-dataset TEXT
--generated TEXT
--help Show this message and exit.
The following command populates the metadata of a computation in ro-crate-metadata.json using only the required options. A unique identifier is generated automatically to represent the computation.
fairscape-cli rocrate register computation \
--name "calibrate pairwise distance" \
--run-by "Qin, Y." \
--date-created "2021-05-23" \
--description "Average the predicted proximities" \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
"./test_rocrate"
The same operation can be performed using both required and optional parameters with the following command.
fairscape-cli rocrate register computation \
--guid "ark:5982/UVA/B2AI/test_rocrate/calibrate_pairwise_distance-Computation" \
--name "calibrate pairwise distance" \
--run-by "Qin, Y." \
--command "some command" \
--date-created "2021-05-23" \
--description "Average the predicted proximities" \
--keywords "b2ai" \
--keywords "clustering" \
--used-software "random_forest_output (https://github.com/idekerlab/MuSIC/blob/master/random_forest_output.py)" \
--used-dataset "IF_emd_1_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_1.pkl" \
--used-dataset "IF_emd_2_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_1.pkl" \
--used-dataset "IF_emd_1_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_2.pkl" \
--used-dataset "IF_emd_2_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_2.pkl" \
--used-dataset """Fold 1 proximities: IF_emd_1_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_3.pkl""" \
--used-dataset "IF_emd_2_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_3.pkl" \
--used-dataset """Fold 1 proximities: IF_emd_1_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_4.pkl""" \
--used-dataset "IF_emd_2_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_4.pkl" \
--used-dataset """Fold 1 proximities: IF_emd_1_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_5.pkl""" \
--used-dataset "IF_emd_2_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_5.pkl" \
--generated "averages of predicted protein proximities (https://github.com/idekerlab/MuSIC/blob/master/Examples/MuSIC_predicted_proximity.txt)" \
"./test_rocrate"
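Typing ten --used-dataset options by hand is error-prone. As a sketch, the repeated options can be generated from the file naming pattern in the example above. The helper name used_dataset_names is hypothetical, and the generated values are the bare file names, without the "Fold 1 proximities:" labels that some of the entries above carry.

```shell
#!/bin/sh
# Hypothetical helper: prints the ten .pkl file names from the example
# above, one per line (two embeddings for each of five folds).
used_dataset_names() {
    for fold in 1 2 3 4 5; do
        for emd in 1 2; do
            echo "IF_emd_${emd}_APMS_emd_1.RF_maxDep_30_nEst_1000.fold_${fold}.pkl"
        done
    done
}

# Collect the names as repeated --used-dataset options in the positional
# parameters. Word splitting is safe here because the names have no spaces.
set --
for name in $(used_dataset_names); do
    set -- "$@" --used-dataset "$name"
done

# Print the assembled options, one word per line, for inspection.
printf '%s\n' "$@"
```

The assembled list could then be appended as "$@" to a register computation command, alongside the other options shown above and before the crate path.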
Dataset metadata
To register a dataset, use the register dataset sub-command and include the filepath option to specify the source file path. This command adds metadata about the dataset to ro-crate-metadata.json in the ROCRATE_PATH directory.
To view all available options and arguments for registering a dataset, run fairscape-cli rocrate register dataset --help:
Usage: fairscape-cli rocrate register dataset [OPTIONS] ROCRATE_PATH
Register Dataset object metadata with the specified RO-Crate
Options:
--guid TEXT
--name TEXT [required]
--url TEXT
--author TEXT [required]
--version TEXT [required]
--date-published TEXT [required]
--description TEXT [required]
--keywords TEXT [required]
--data-format TEXT [required]
--filepath TEXT [required]
--used-by TEXT
--derived-from TEXT
--schema TEXT
--associated-publication TEXT
--additional-documentation TEXT
--help Show this message and exit.
Execute the following command to use all available options and arguments for registering a dataset:
fairscape-cli rocrate register dataset \
--guid "ark:5982/UVA/B2AI/example_rocrate/AP-MS_embeddings-Dataset" \
--name "AP-MS embeddings" \
--url "https://github.com/idekerlab/MuSIC/blob/master/Examples/APMS_embedding.MuSIC.csv" \
--author "Krogan lab (https://kroganlab.ucsf.edu/krogan-lab)" \
--version "1.0" \
--date-published "2021-04-23" \
--description "Affinity purification mass spectrometer (APMS) embeddings for each protein in the study, generated by node2vec predict." \
--keywords "apms" \
--keywords "b2ai" \
--keywords "cm4ai" \
--data-format "CSV" \
--filepath "./test_rocrate/APMS_embedding_MUSIC.csv" \
--used-by "create labeled training & test sets random_forest_samples.py" \
--derived-from "node2vec predict" \
--associated-publication "Qin, Y. et al. A multi-scale map of cell structure fusing protein images and interactions" \
--additional-documentation "https://idekerlab.ucsd.edu/music/" \
"./test_rocrate"
Software metadata
To register software, use the register software sub-command and include the filepath option to specify the source file path. This command adds metadata about the software to the ro-crate-metadata.json file in the ROCRATE_PATH directory.
To view all available options and arguments for registering software, run fairscape-cli rocrate register software --help:
Usage: fairscape-cli rocrate register software [OPTIONS] ROCRATE_PATH
Register a Software metadata record to the specified ROCrate
Options:
--guid TEXT
--name TEXT [required]
--author TEXT [required]
--version TEXT [required]
--description TEXT [required]
--keywords TEXT [required]
--file-format TEXT [required]
--url TEXT
--date-modified TEXT
--filepath TEXT
--used-by-computation TEXT
--associated-publication TEXT
--additional-documentation TEXT
--help Show this message and exit.
Execute the following command to use all available options and arguments for registering software:
fairscape-cli rocrate register software \
--guid "ark:5982/UVA/B2AI/example_rocrate/calibrate_pairwise_distance-Software" \
--name "calibrate pairwise distance" \
--author "Qin, Y." \
--version "1.0" \
--description "script written in python to calibrate pairwise distance." \
--keywords "b2ai" \
--keywords "U2OS" \
--file-format "py" \
--url "https://github.com/idekerlab/MuSIC/blob/master/calibrate_pairwise_distance.py" \
--filepath "./test_rocrate/calibrate_pairwise_distance.py" \
--date-modified "2021-06-20" \
--used-by-computation "ARK:compute_standard_proximities.1/f9aa5f3f-665a-4ab9-8879-8d0d52f05265" \
--associated-publication "Qin, Y. et al. A multi-scale map of cell structure fusing protein images and interactions. Nature 600, 536–542 2021" \
--additional-documentation "https://idekerlab.ucsd.edu/music/" \
"./test_rocrate"