External API Guide
External systems can query platform status, sources, jobs, and latest runs, and can use safe simulation endpoints for crawling, parsing, deduplication, compliance, and scheduling logic.
Base URL: https://crawler.sun-bd.com
Response envelope:
{
  "success": true,
  "traceId": "00-...",
  "data": { },
  "warnings": []
}
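A client will usually want to unwrap this envelope before using the payload. The following is a minimal Python sketch, not an official client: the `unwrap` helper and `ApiError` class are hypothetical names, and it assumes that `"success": false` signals a failed call (the guide does not show the failure shape).

```python
import json
from typing import Any, List, Tuple


class ApiError(Exception):
    """Raised when an envelope reports success = false (assumed failure signal)."""


def unwrap(body: str) -> Tuple[Any, List[Any]]:
    """Parse a response envelope and return (data, warnings)."""
    envelope = json.loads(body)
    if not envelope.get("success", False):
        # Keep the trace ID in the message so the failed call can be reported.
        raise ApiError(f"request failed (traceId={envelope.get('traceId')})")
    return envelope.get("data"), envelope.get("warnings", [])


# Usage with the sample envelope shown above.
sample = '{"success": true, "traceId": "00-...", "data": {}, "warnings": []}'
data, warnings = unwrap(sample)
```

Returning `warnings` alongside `data` keeps non-fatal notices visible to the caller instead of silently dropping them.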
This project does not provide local member login or registration. Protected management APIs require the access token returned by the central member API. Public guide endpoints remain available for demonstrations and safe simulations.
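The guide does not state how the access token is attached to a request. The sketch below assumes the common `Authorization: Bearer <token>` header, which may not match the platform's actual scheme; `build_request` is a hypothetical helper, and it only constructs the request without sending it.

```python
import urllib.request
from typing import Optional

BASE_URL = "https://crawler.sun-bd.com"


def build_request(path: str, access_token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, attaching the token only when one is supplied."""
    request = urllib.request.Request(
        BASE_URL + path,
        headers={"Accept": "application/json"},
    )
    if access_token:
        # Assumption: Bearer scheme. The guide does not document the header format.
        request.add_header("Authorization", f"Bearer {access_token}")
    return request


# Public guide endpoints are built without a token.
public_request = build_request("/api/public/v1/health")
```

For a protected management API, pass the token obtained from the central member API as `access_token`.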
Endpoints:
Returns platform health, store mode, index prefix, and storage status.
Lists available APIs, methods, purposes, and sample payloads.
Returns platform capability summaries for product pages or external clients.
Reads source URL, legal status, robots status, and scheduling details.
Reads jobs, cadence, enabled state, source mapping, and parser rules.
Returns the latest worker run status, including success, failure, HTTP 403 responses, CAPTCHA challenges, and quality indicators.
Performs a crawl dry run and returns estimated records, warnings, and a quality summary.
Suggests CSS selector and XPath rules from a URL or sample HTML.
Checks whether content is a likely duplicate, using hash and similarity comparisons.
Checks whether robots.txt allows a specified user-agent to crawl a URL.
Generates upcoming run times from a cron expression.
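The simulation endpoints run on the platform, and this guide does not describe their internals. As a rough local illustration of two of the checks listed above (robots.txt permission and hash-plus-similarity duplicate detection), here is a minimal Python sketch using only the standard library. The function names, the whitespace and case normalization, and the 0.9 similarity threshold are all assumptions, not platform behavior.

```python
import hashlib
from difflib import SequenceMatcher
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether robots.txt content allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def _normalize(text: str) -> str:
    # Assumed normalization: lowercase and collapse runs of whitespace.
    return " ".join(text.lower().split())


def is_likely_duplicate(a: str, b: str, threshold: float = 0.9) -> bool:
    """Exact match by SHA-256 hash first, then a character-level similarity ratio."""
    na, nb = _normalize(a), _normalize(b)
    if hashlib.sha256(na.encode("utf-8")).digest() == hashlib.sha256(nb.encode("utf-8")).digest():
        return True
    return SequenceMatcher(None, na, nb).ratio() >= threshold


# Usage with illustrative inputs (example.com and demo-bot are placeholders).
rules = "User-agent: *\nDisallow: /private/\n"
blocked = is_allowed(rules, "demo-bot", "https://example.com/private/page")
same = is_likely_duplicate("Hello   World", "hello world")
```

Comparing hashes first settles exact matches cheaply. The similarity ratio is only needed for near-duplicates.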
Example requests:
curl https://crawler.sun-bd.com/api/public/v1/health
curl https://crawler.sun-bd.com/api/public/v1/catalog
curl "https://crawler.sun-bd.com/api/public/v1/sources?tenantId=tenant-hq"
curl -X POST https://crawler.sun-bd.com/api/public/v1/crawl/simulate \
-H "Content-Type: application/json" \
-d "{\"tenantId\":\"tenant-hq\",\"jobId\":\"job-news-demo\",\"maxRecords\":8}"
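The same requests can be issued from Python. The sketch below mirrors the curl commands above using only the standard library. The helper names (`get_request`, `simulate_crawl_request`, `send`) are hypothetical, and the response is assumed to be a JSON envelope as described earlier.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "https://crawler.sun-bd.com"


def get_request(path: str, params: Optional[Dict[str, str]] = None) -> urllib.request.Request:
    """Build a GET request with optional query parameters."""
    query = "?" + urllib.parse.urlencode(params) if params else ""
    return urllib.request.Request(BASE_URL + path + query, method="GET")


def simulate_crawl_request(tenant_id: str, job_id: str, max_records: int) -> urllib.request.Request:
    """Build the POST request for the crawl dry run shown in the curl example."""
    payload = {"tenantId": tenant_id, "jobId": job_id, "maxRecords": max_records}
    return urllib.request.Request(
        BASE_URL + "/api/public/v1/crawl/simulate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send a request and decode the JSON body (performs a real network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send(get_request("/api/public/v1/sources", {"tenantId": "tenant-hq"}))` corresponds to the third curl command, and `send(simulate_crawl_request("tenant-hq", "job-news-demo", 8))` to the last one.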