Generating a Manifest File from SearchAPI for use with the Command Line Transfer
The HuBMAP Command Line Transfer Utility (HuBMAP-CLT) transfers multiple datasets in a single batch. It takes a manifest file that lists the HuBMAP ID of each dataset and, optionally, a path to specific resources within that dataset. More information about the HuBMAP-CLT can be found here. When a user wishes to download every dataset returned by a SearchAPI query, an optional URL argument on certain supported endpoints returns the text needed for a manifest file instead of the endpoint's usual output.
Creating a Manifest
For general usage of SearchAPI, refer to its Smart API page where you can find a breakdown of its different endpoints as well as example queries.
The following endpoints support generating a manifest file:
/search
/<index_without_prefix>/search
/param-search/<entity_type>
To generate a manifest, append the argument ?produce-clt-manifest=true to the URL. For /search, for example, with the base SearchAPI URL "https://search.api.hubmapconsortium.org/v3/", the full URL is "https://search.api.hubmapconsortium.org/v3/search?produce-clt-manifest=true".
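The URL construction described above can be sketched as a small shell helper. The function name manifest_url and the variable BASE_URL are names chosen here for illustration; they are not part of SearchAPI or the HuBMAP-CLT.

```shell
# Base SearchAPI URL, as given in the text (without the trailing slash).
BASE_URL="https://search.api.hubmapconsortium.org/v3"

# manifest_url is a hypothetical helper, not part of SearchAPI or the HuBMAP-CLT.
# It prints the manifest-generating URL for a supported endpoint path.
manifest_url() {
    endpoint="${1#/}"   # drop a leading slash, if present
    printf '%s/%s?produce-clt-manifest=true\n' "$BASE_URL" "$endpoint"
}

manifest_url /search
# prints https://search.api.hubmapconsortium.org/v3/search?produce-clt-manifest=true
```

The same helper would apply to the other two supported endpoints once their placeholders are replaced with real values.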
Limitations
Adding this argument prevents the endpoint from returning its normal result; only the manifest is returned. Additionally, the generated manifest does not specify paths to individual resources within a dataset. Each entry covers the dataset's entire directory, as denoted by the "/" in the manifest.
Example Query and Manifest
The following example shows the URL used to generate a manifest, the query sent in the request body, and the manifest that is returned:
https://search.api.hubmapconsortium.org/v3/search?produce-clt-manifest=true
Request body:
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "donor.group_name": "Vanderbilt TMC"
          }
        }
      ],
      "filter": [
        {
          "match": {
            "entity_type.keyword": "Dataset"
          }
        }
      ]
    }
  }
}
Generated manifest:
HBM744.FNLN.846 /
HBM658.VPJK.669 /
HBM592.RPKF.946 /
HBM363.TBHH.346 /
HBM322.XJQZ.894 /
HBM749.MTJC.865 /
HBM722.TVXP.469 /
HBM223.JQLM.452 /
HBM524.KHPH.599 /
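Because the manifest is plain text with one dataset per line, it can be sanity-checked before it is handed to the HuBMAP-CLT. The following is a minimal sketch, not an official tool: check_manifest is a hypothetical helper, and the ID pattern is only inferred from the sample manifest above, so it may not cover every valid HuBMAP ID.

```shell
# check_manifest is a hypothetical helper, not part of the HuBMAP-CLT.
# Usage: check_manifest <manifest file>
# It reports every non-blank line that is not "<HuBMAP ID> <path>" and
# returns a non-zero status if any such line is found.
# Assumption: IDs look like HBM744.FNLN.846, as in the sample manifest.
check_manifest() {
    awk '
        NF == 0 { next }                                  # skip blank lines
        NF < 2 || $1 !~ /^HBM[0-9]+\.[A-Z]+\.[0-9]+$/ {   # missing path or odd ID
            printf "line %d looks malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Running check_manifest manifest.txt prints nothing and succeeds when every line has the expected shape. An error page saved by mistake in place of a manifest would be reported line by line.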
Using the Generated Manifest
The generated manifest is returned as plain text. In order to use it with the HuBMAP-CLT, it must be made into a file.
For the following example, we’ll use the curl utility to interact with the SearchAPI and generate a manifest. For more information on installing or using curl, consult the curl documentation.
Example:
Using the example query above, say we wish to generate a manifest of the datasets whose ancestor donors have a group_name of Vanderbilt TMC. Our URL will be https://search.api.hubmapconsortium.org/v3/search?produce-clt-manifest=true. curl lets us save the returned text to a file with any name we like by using the -o flag followed by the desired file name. Our complete curl command would look like:
curl -X POST "https://search.api.hubmapconsortium.org/v3/search?produce-clt-manifest=true" -H "Authorization: Bearer <token>" -H "Content-Type: application/json" -d '{"query": {"bool": {"must": [{"match_phrase": {"donor.group_name": "Vanderbilt TMC"}}], "filter": [{"match": {"entity_type.keyword": "Dataset"}}]}}}' -o manifest.txt
Where <token> is your Globus Groups token. This saves the manifest to a file with the chosen name and location, which can then be used with the HuBMAP-CLT.
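As a variation on the curl command above, the query can be kept in its own file and the request wrapped in a function. This is a sketch under stated assumptions rather than a recommended workflow: fetch_manifest is a hypothetical wrapper, and HUBMAP_TOKEN is an environment variable name chosen here to hold the Globus Groups token.

```shell
# fetch_manifest is a hypothetical wrapper, not part of SearchAPI or the HuBMAP-CLT.
# Usage: fetch_manifest <query JSON file> <output manifest file>
# Assumption: the Globus Groups token is exported as HUBMAP_TOKEN (a name chosen here).
fetch_manifest() {
    query_file="$1"
    out_file="$2"
    # --fail              exit non-zero on an HTTP error response
    # --silent/--show-error  hide the progress meter but still print errors
    # -d @file            read the request body from a file
    # -o file             write the response to the named file
    curl --fail --silent --show-error -X POST \
        "https://search.api.hubmapconsortium.org/v3/search?produce-clt-manifest=true" \
        -H "Authorization: Bearer ${HUBMAP_TOKEN}" \
        -H "Content-Type: application/json" \
        -d @"$query_file" \
        -o "$out_file"
}
```

With the request body from the example query saved as query.json, calling fetch_manifest query.json manifest.txt would produce the same manifest file as the one-line command. Keeping the JSON in a file avoids quoting the whole query on the command line.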
Example:
hubmap-clt transfer manifest.txt
More information about using the HuBMAP-CLT can be found here.