Dataset Endpoints

Endpoints for creating, retrieving, downloading, and deleting individual dataset records. Datasets can be uploaded with their file content or registered with only metadata (for external or embargoed data).

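All URLs are relative to https://fairscape.net/api. The Python examples use the requests library and assume that BASE_URL is set to this base URL and that token holds a valid bearer token.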


Endpoint Summary

Method  Path                                    Auth Required  Description
POST    /dataset                                Yes            Create a dataset record (with optional file upload)
GET     /dataset/ark:{naan}/{postfix}           No             Get dataset metadata
GET     /dataset/download/ark:{naan}/{postfix}  Yes            Download dataset file content
DELETE  /dataset/ark:{naan}/{postfix}           Yes            Delete a dataset
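
The ARK-based paths in the table share one shape, so a client can derive them from a single identifier. The sketch below is illustrative only: dataset_url and its download flag are hypothetical helper names, not part of the FAIRSCAPE API, and the ARK pattern it accepts is an assumption based on the example identifiers on this page.

```python
import re

BASE_URL = "https://fairscape.net/api"

# Assumed shape: "ark:" + numeric NAAN + "/" + postfix (based on this page's examples).
ARK_PATTERN = re.compile(r"^ark:(?P<naan>\d+)/(?P<postfix>\S+)$")


def dataset_url(ark: str, download: bool = False) -> str:
    """Return the metadata URL, or the download URL, for a dataset ARK."""
    match = ARK_PATTERN.match(ark)
    if match is None:
        raise ValueError(f"not a valid ARK identifier: {ark!r}")
    prefix = "dataset/download" if download else "dataset"
    return f"{BASE_URL}/{prefix}/ark:{match['naan']}/{match['postfix']}"


print(dataset_url("ark:59853/apms-embeddings-2024"))
print(dataset_url("ark:59853/apms-embeddings-2024", download=True))
```

The same metadata URL also serves DELETE requests, so only the HTTP method changes between retrieving and deleting a record.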

POST /dataset

Create a new dataset record. Accepts multipart/form-data with a JSON metadata field and an optional file.

Form Fields:

Field            Type         Description
datasetMetadata  JSON string  Dataset metadata (see schema below)
uploadFile       file         (Optional) The actual dataset file
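
When a dataset is registered with metadata only, the request still has to be multipart/form-data. With the requests library, passing only data= produces a URL-encoded body, so one common approach is to pass the field through files= with a None filename. The sketch below builds such a request without sending it. build_metadata_only_request is a hypothetical helper, and whether the server expects further fields for metadata-only records is not covered here.

```python
import json

import requests

BASE_URL = "https://fairscape.net/api"


def build_metadata_only_request(metadata: dict, token: str) -> requests.PreparedRequest:
    """Prepare a multipart POST /dataset request that carries no file part."""
    request = requests.Request(
        "POST",
        f"{BASE_URL}/dataset",
        # A (None, value) tuple makes requests emit a plain form field
        # inside a multipart/form-data body.
        files={"datasetMetadata": (None, json.dumps(metadata))},
        headers={"Authorization": f"Bearer {token}"},
    )
    return request.prepare()


prepared = build_metadata_only_request(
    {"@id": "ark:59853/my-dataset-2024", "@type": "Dataset", "name": "AP-MS Embeddings"},
    token="<token>",
)
print(prepared.headers["Content-Type"])
# To send it: requests.Session().send(prepared)
```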

Metadata schema (key fields):

{
  "@id": "ark:59853/my-dataset-2024",
  "@type": "Dataset",
  "name": "AP-MS Embeddings",
  "description": "Protein interaction embeddings",
  "author": "Krogan Lab",
  "version": "1.0",
  "datePublished": "2024-01-15",
  "keywords": ["proteomics"],
  "dataFormat": "CSV"
}

Python:

import json

import requests

BASE_URL = "https://fairscape.net/api"
token = "<token>"  # replace with your bearer token

metadata = {
    "@id": "ark:59853/apms-embeddings-2024",
    "@type": "Dataset",
    "name": "AP-MS Embeddings",
    "description": "APMS embeddings for each protein",
    "author": "Krogan Lab",
    "version": "1.0",
    "datePublished": "2024-01-15",
    "keywords": ["proteomics", "b2ai"],
    "dataFormat": "CSV"
}

# Send the metadata as a JSON string alongside the file in one multipart request.
with open("embeddings.csv", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/dataset",
        data={"datasetMetadata": json.dumps(metadata)},
        files={"uploadFile": ("embeddings.csv", f, "text/csv")},
        headers={"Authorization": f"Bearer {token}"}
    )
print(response.json())

curl:

curl -X POST "https://fairscape.net/api/dataset" \
     -H "Authorization: Bearer <token>" \
     -F 'datasetMetadata={"@id":"ark:59853/apms-embeddings-2024","name":"AP-MS Embeddings","@type":"Dataset",...}' \
     -F "uploadFile=@embeddings.csv;type=text/csv"
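
Checking the metadata locally before uploading can catch omissions early. The sketch below treats the key fields from the schema above as expected. That list is an assumption drawn from the example, and the server may require a different set. missing_fields is a hypothetical helper.

```python
# Key fields taken from the example schema; treating all of them as
# required is an assumption, not a documented rule of the API.
EXPECTED_FIELDS = (
    "@id", "@type", "name", "description", "author",
    "version", "datePublished", "keywords", "dataFormat",
)


def missing_fields(metadata: dict) -> list[str]:
    """Return the expected fields that are absent or empty, in schema order."""
    return [field for field in EXPECTED_FIELDS if not metadata.get(field)]


draft = {
    "@id": "ark:59853/apms-embeddings-2024",
    "@type": "Dataset",
    "name": "AP-MS Embeddings",
    "keywords": [],
}
print(missing_fields(draft))
```

An empty result means every expected field is present and non-empty. It does not guarantee that the server will accept the record.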

GET /dataset/ark:{naan}/{postfix}

Retrieve metadata for a dataset. Public — no authentication required.

Python:

response = requests.get(
    f"{BASE_URL}/dataset/ark:59853/apms-embeddings-2024"
)
print(response.json())

curl:

curl "https://fairscape.net/api/dataset/ark:59853/apms-embeddings-2024"

Note

You can also use the universal ARK resolver (GET /ark:{naan}/{postfix}) to retrieve dataset metadata — the result is the same.


GET /dataset/download/ark:{naan}/{postfix}

Download the file content for a dataset. The response Content-Type is inferred from the filename.
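
Inferring a Content-Type from a filename usually means mapping the file extension to a MIME type. The sketch below shows that idea with Python's standard mimetypes module. It illustrates the general technique only; the server's actual mapping and its fallback type are assumptions here, and guess_content_type is a hypothetical helper.

```python
import mimetypes

# A dedicated MimeTypes instance, seeded from Python's built-in extension table.
_TYPES = mimetypes.MimeTypes()


def guess_content_type(filename: str, default: str = "application/octet-stream") -> str:
    """Map a filename's extension to a MIME type, with a generic binary fallback."""
    content_type, _encoding = _TYPES.guess_type(filename)
    return content_type or default


print(guess_content_type("embeddings.csv"))
print(guess_content_type("embeddings.unknownext"))
```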

Python:

response = requests.get(
    f"{BASE_URL}/dataset/download/ark:59853/apms-embeddings-2024",
    headers={"Authorization": f"Bearer {token}"},
    stream=True
)
response.raise_for_status()  # stop here rather than write an error body to disk
# Stream the body to disk in chunks so large files are not held in memory.
with open("downloaded-embeddings.csv", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

curl:

curl -o embeddings.csv \
     -H "Authorization: Bearer <token>" \
     "https://fairscape.net/api/dataset/download/ark:59853/apms-embeddings-2024"

DELETE /dataset/ark:{naan}/{postfix}

Delete a dataset record and its associated file content.

Python:

response = requests.delete(
    f"{BASE_URL}/dataset/ark:59853/apms-embeddings-2024",
    headers={"Authorization": f"Bearer {token}"}
)
print(response.status_code)

curl:

curl -X DELETE "https://fairscape.net/api/dataset/ark:59853/apms-embeddings-2024" \
     -H "Authorization: Bearer <token>"