curl --request GET \
  --url https://api.trieve.ai/api/crawl \
  --header 'Authorization: <api-key>' \
  --header 'TR-Dataset: <tr-dataset>'
[
  {
    "attempt_number": 123,
    "crawl_options": {
      "crawl_options": {
        "allow_external_links": false,
        "boost_titles": true,
        "exclude_tags": [
          "#ad",
          "#footer",
          "header",
          "head",
          "navbar",
          "footer",
          "aside",
          "nav",
          "form"
        ],
        "heading_remove_strings": [
          "Advertisement",
          "Sponsored"
        ],
        "ignore_sitemap": true,
        "include_tags": [],
        "interval": "daily",
        "limit": 50,
        "site_url": "nedzo.ai"
      }
    },
    "crawl_type": "firecrawl",
    "created_at": "2023-11-07T05:31:56Z",
    "dataset_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "interval": "<string>",
    "next_crawl_at": "2023-11-07T05:31:56Z",
    "scrape_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "status": "Pending",
    "url": "<string>"
  }
]
This endpoint returns all crawl requests for a dataset.
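For readers calling the endpoint from code rather than curl, the same request could be sketched in Python roughly as follows. This is a minimal illustration using only the standard library; the URL and the two header names come from the curl example, while the function names `build_crawl_list_request` and `fetch_crawl_requests` and the 30-second timeout are assumptions made for this sketch.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.trieve.ai/api/crawl"


def build_crawl_list_request(api_key: str, dataset_id: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for listing crawl requests.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return urllib.request.Request(
        API_URL,
        method="GET",
        headers={
            "Authorization": api_key,   # value shown as <api-key> in the curl example
            "TR-Dataset": dataset_id,   # value shown as <tr-dataset> in the curl example
        },
    )


def fetch_crawl_requests(api_key: str, dataset_id: str) -> list:
    """Send the request and decode the JSON array of crawl requests."""
    request = build_crawl_list_request(api_key, dataset_id)
    # The timeout value is an arbitrary choice for this sketch.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the header handling easy to inspect without touching the network.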
TR-Dataset (header): The dataset id to use for the request
The page number to retrieve
The number of items to retrieve per page
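The page number and page size would typically be passed as query parameters. This page does not give their names, so the sketch below assumes `page` and `limit`; treat both names, and the helper `crawl_list_url`, as hypothetical and check them against the API reference before relying on them.

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoint taken from the curl example on this page.
API_URL = "https://api.trieve.ai/api/crawl"


def crawl_list_url(page: Optional[int] = None, limit: Optional[int] = None) -> str:
    """Return the endpoint URL with optional pagination parameters.

    ASSUMPTION: the query parameter names "page" and "limit" are not
    confirmed by this page; they are placeholders for illustration.
    """
    params = {}
    if page is not None:
        params["page"] = page
    if limit is not None:
        params["limit"] = limit
    # Only append a query string when at least one parameter is set.
    return f"{API_URL}?{urlencode(params)}" if params else API_URL
```

Omitting both arguments yields the bare URL used in the curl example, so server-side defaults apply.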
Crawl requests retrieved successfully
The response is of type object[].
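Because the response is an array of objects, a client will usually iterate over it. As one possible way to work with the decoded array, the sketch below groups crawl URLs by their `status` field. The field names `status` and `url` come from the example response; the helper name `group_urls_by_status` and the `"unknown"` fallback are assumptions made for illustration.

```python
from collections import defaultdict


def group_urls_by_status(crawl_requests: list) -> dict:
    """Group the "url" of each crawl request under its "status" value.

    Hypothetical helper: missing statuses fall back to "unknown",
    which is a choice made for this sketch, not an API value.
    """
    grouped = defaultdict(list)
    for item in crawl_requests:
        grouped[item.get("status", "unknown")].append(item.get("url"))
    return dict(grouped)
```

Using `.get` avoids a `KeyError` if a field is absent from an item, which seems a safer default when the full response schema is not spelled out.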