Once you've mastered the concepts here, you can use our API to fully automate and integrate with the platform.
Detailed examples on how to use our API to access the Stevesie platform.
Non-public endpoints of the Stevesie API (to manage your account, run workflows, execute endpoints, etc.) require an authentication token.
You can execute the third-party endpoints we integrate with through our API, which gives your client programs a clean and safe way to reach individual third-party endpoints from behind a proxy. You only supply the inputs to the endpoint, and our API automatically templatizes them into the proper request for you.
Execute an endpoint directly through our API and get back the raw response from the third party.
POST https://stevesie.com/cloud/api/v1/endpoints/Endpoint ID/executions
Headers
application/json
Body
{
  "inputs": { ... },
  "proxy": {
    "type": "shared",
    "location": "nyc"
  },
  "format": "json"
}
Notes
The inputs key is optional and should contain key-value pairs of inputs to send to the third-party API, corresponding to the endpoint's variable names.
Please see the specific endpoint you're interested in executing and scroll down to view instructions under the "API" tab.
Proxy Locations
nyc | 🇺🇸 New York, United States
sfo | 🇺🇸 San Francisco, United States
ams | 🇳🇱 Amsterdam, Netherlands
sgp | 🇸🇬 Singapore, Singapore
lon | 🇬🇧 London, United Kingdom
fra | 🇩🇪 Frankfurt, Germany
tor | 🇨🇦 Toronto, Canada
blr | 🇮🇳 Bangalore, India
{
  "object": {
    "inputs": { ... },
    "proxy": {
      "type": "shared",
      "location": "nyc"
    },
    "response": {
      "status_code": 200,
      "headers": { ... },
      "response_json": { ... }
    }
  }
}
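As a rough illustration of the request described above, the sketch below assembles the URL, headers, and body for an endpoint execution without sending anything. The `Token` header name is taken from the Python examples elsewhere on this page. The helper name `build_endpoint_execution`, the `Content-Type` header name, and the client-side check of the location code are assumptions made for illustration, not part of the documented API.

```python
import json

URL_BASE = 'https://stevesie.com/cloud/api/v1'

# Location codes copied from the Proxy Locations list above.
PROXY_LOCATIONS = {'nyc', 'sfo', 'ams', 'sgp', 'lon', 'fra', 'tor', 'blr'}


def build_endpoint_execution(endpoint_id, api_token, inputs=None, location='nyc'):
    """Return (url, headers, body) for an endpoint execution request.

    Hypothetical helper: it only assembles the request, nothing is sent.
    """
    if location not in PROXY_LOCATIONS:
        raise ValueError('unknown proxy location: {}'.format(location))
    url = '{}/endpoints/{}/executions'.format(URL_BASE, endpoint_id)
    # 'Content-Type' is an assumed header name for the JSON body.
    headers = {'Token': api_token, 'Content-Type': 'application/json'}
    body = {
        'proxy': {'type': 'shared', 'location': location},
        'format': 'json',
    }
    if inputs:  # "inputs" is optional, so leave it out when empty
        body['inputs'] = inputs
    return url, headers, json.dumps(body)
```

The returned triple can be handed to any HTTP client, for example `requests.post(url, headers=headers, data=body)`.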
You can launch workflows via API, check on the status, and get the URLs to download result files over API as well.
Create a new execution for a given workflow.
You'll be returned the new execution's ID that you can use to monitor the status.
You can also optionally provide a webhook_url that we will make a POST request to upon completion, sending workflow_execution_id=workflow_execution_id as the body data.
POST https://stevesie.com/cloud/api/v1/workflows/Workflow ID/executions
Headers
application/json
Body
{
  "execution_parameters": { ... },
  "proxy": {
    "type": "dedicated",
    "location": "nyc"
  },
  # FOLLOWING ARE OPTIONAL
  "output_aggregation_format": "csv",
  "maximum_requests_count": 10000,
  "do_proxy_hot_swap": false,
  "execution_name": "...",
  "rate_limit_sleep_between_requests_seconds": 3,
  "rate_limit_sleep_on_rate_limited_seconds": 300,
  "rate_limit_batch_size": 100,
  "rate_limit_batch_wait_seconds": 300,
  "webhook_url": "https://yourservice.com/callback-url"
}
Notes
The execution_parameters key is optional and should contain key-value pairs of inputs to send to the third-party API via the workflow, corresponding to the endpoint's variable names.
Please see the specific workflow you're interested in executing and scroll down to view instructions under the "API" tab.
Note that you can also pass an array as the value to any key to signify processing a collection (as an alternative to using input collections).
E.g. you could pass the following as execution_parameters to process a list of "username" variables in the workflow:
{ "username": ["user1", "user2"] }
Proxy Locations
nyc | 🇺🇸 New York, United States
sfo | 🇺🇸 San Francisco, United States
ams | 🇳🇱 Amsterdam, Netherlands
sgp | 🇸🇬 Singapore, Singapore
lon | 🇬🇧 London, United Kingdom
fra | 🇩🇪 Frankfurt, Germany
tor | 🇨🇦 Toronto, Canada
blr | 🇮🇳 Bangalore, India
{
"object": {
"id": "Workflow Execution ID",
"created_at": "..."
}
}
import requests
from datetime import datetime

API_TOKEN = 'XXX'
WORKFLOW_ID = 'WORKFLOW_ID'
URL_BASE = 'https://stevesie.com/cloud/api/v1'

auth_headers = {
    'Token': API_TOKEN,
}

MIN_DATE = '2019-01-01'
MAX_DATE = '2019-02-01'

# Convert the date bounds to Unix timestamps (naive datetimes use local time).
min_date_timestamp = int(datetime.strptime(MIN_DATE, '%Y-%m-%d').timestamp())
max_date_timestamp = int(datetime.strptime(MAX_DATE, '%Y-%m-%d').timestamp())

# Launch the workflow; the response contains the new execution's ID.
execution = requests.post(
    '{}/workflows/{}/executions'.format(URL_BASE, WORKFLOW_ID),
    json={
        'proxy': {
            'type': 'dedicated',
            'location': 'nyc',
        },
        'execution_parameters': {
            'min_timestamp': min_date_timestamp,
            'max_timestamp': max_date_timestamp,
        },
        'do_proxy_hot_swap': False,
    },
    headers=auth_headers,
).json()

print(execution)
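When a webhook_url is supplied, the completion callback carries workflow_execution_id=... in its body. Below is a minimal sketch of extracting that ID on the receiving side. It assumes the body is form-encoded, as the key=value notation suggests, and the function name `extract_execution_id` is hypothetical.

```python
from urllib.parse import parse_qs


def extract_execution_id(raw_body):
    """Pull workflow_execution_id out of a form-encoded webhook body.

    Assumes an application/x-www-form-urlencoded body; accepts str or bytes.
    """
    if isinstance(raw_body, bytes):
        raw_body = raw_body.decode('utf-8')
    fields = parse_qs(raw_body)
    values = fields.get('workflow_execution_id')
    if not values:
        raise ValueError('webhook body has no workflow_execution_id')
    return values[0]
```

The returned ID can then be used to fetch the execution's status and result URLs.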
Check the status of a workflow and if finished, get the URLs to download the results.
If the execution was large and split into batches, wait until the is_waiting_batches flag returns false, and then use the batch_results array instead to get the full results.
GET https://stevesie.com/cloud/api/v1/workflows/executions/Workflow Execution ID
Headers
{
  "object": {
    "id": "...",
    "workflow_id": "...",
    "execution_name": "...",
    "execution_parameters": { ... },
    "created_at": "...",
    "completed_at": "...",
    "results": [
      {
        "id": "...",
        "url": "...",
        "filename": "...",
        "filesize_bytes": 1000,
        "item_count": 10
      }
    ],
    "proxy": {
      "type": "shared|dedicated|custom",
      "location": "...",
      "proxy_id": "...",
      "url": "..."
    },
    "requests_count": 10,
    "requests_avg_response_bytes": 10000,
    "rate_limit_sleep_between_requests_seconds": "3.00",
    "rate_limit_sleep_on_rate_limited_seconds": "300",
    "rate_limit_batch_size": 100,
    "rate_limit_batch_wait_seconds": "300",
    "completion_reason": "finished|insufficient_balance|excessive_rate_limiting|pagination_no_extractions|no_extractions",
    "completion_reason_details": "...",
    "downstream_executions": [
      {
        "id": "...",
        "workflow_id": "...",
        "workflow_name": "...",
        "execution_name": "...",
        "created_at": "...",
        "completed_at": "...",
        "results": [ ... ],
        "downstream_executions": [ ... ]
      }
    ],
    "webhook_url": "https://yourservice.com/callback-url",
    "is_waiting_batches": false,
    "batch_results": [
      {
        "batch_number": 1,
        "results": [
          {
            "id": "...",
            "url": "...",
            "filename": "...",
            "filesize_bytes": 1000,
            "item_count": 10
          }
        ]
      },
      {
        "batch_number": 2,
        "results": [
          {
            "id": "...",
            "url": "...",
            "filename": "...",
            "filesize_bytes": 1000,
            "item_count": 10
          }
        ]
      }
    ]
  }
}
import requests

API_TOKEN = 'XXX'
EXECUTION_ID = 'EXECUTION_ID'
URL_BASE = 'https://stevesie.com/cloud/api/v1'

auth_headers = {
    'Token': API_TOKEN,
}

# Fetch the current status (and any result URLs) for one execution.
status = requests.get(
    '{}/workflows/executions/{}'.format(URL_BASE, EXECUTION_ID),
    headers=auth_headers,
).json()

print(status)
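The rule described in this section (wait for is_waiting_batches to be false, then prefer batch_results over results) can be sketched as a small pure function over the "object" of a status response. The function name `collect_result_urls` is hypothetical, and treating an empty completed_at as "not finished yet" is an assumption, since the response format above does not state how an unfinished execution is reported.

```python
def collect_result_urls(execution):
    """Return the download URLs for an execution, or None if it is not ready.

    `execution` is the "object" dict from the status response.
    """
    # Assumption: completed_at stays empty until the execution has finished.
    if not execution.get('completed_at'):
        return None
    if execution.get('is_waiting_batches'):
        return None  # batches still pending, so poll again later
    batches = execution.get('batch_results') or []
    if batches:
        # Batched run: gather URLs batch by batch, in batch_number order.
        ordered = sorted(batches, key=lambda batch: batch['batch_number'])
        return [r['url'] for batch in ordered for r in batch['results']]
    return [r['url'] for r in execution.get('results', [])]
```

A caller would typically invoke this on each poll, e.g. `collect_result_urls(status['object'])`, and sleep between polls while it returns None.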
List all of the executions for a given workflow.
GET https://stevesie.com/cloud/api/v1/workflows/Workflow ID/executions
Headers
Query Parameters
active or finished
YYYY-MM-DD HH:MM:SS UTC
YYYY-MM-DD HH:MM:SS UTC
{
  "objects": [
    {
      "id": "...",
      "workflow_id": "...",
      "execution_name": "...",
      "execution_parameters": { ... },
      "created_at": "...",
      "completed_at": "...",
      "results": [
        {
          "id": "...",
          "url": "...",
          "filename": "...",
          "filesize_bytes": 1000,
          "item_count": 10
        }
      ],
      "rate_limit_sleep_between_requests_seconds": "3.00",
      "rate_limit_sleep_on_rate_limited_seconds": "300",
      "rate_limit_batch_size": 100,
      "rate_limit_batch_wait_seconds": "300",
      "downstream_executions": [
        {
          "id": "...",
          "workflow_id": "...",
          "workflow_name": "...",
          "execution_name": "...",
          "created_at": "...",
          "completed_at": "...",
          "results": [ ... ],
          "downstream_executions": [ ... ]
        }
      ]
    }
  ]
}
List all the workflows in your account.
GET https://stevesie.com/cloud/api/v1/workflows
Headers
{ "objects": [ { "id": "...", "name": "..." } ] }
import requests

API_TOKEN = 'XXX'
URL_BASE = 'https://stevesie.com/cloud/api/v1'

auth_headers = {
    'Token': API_TOKEN,
}

# List every workflow in the account.
workflows = requests.get(
    '{}/workflows'.format(URL_BASE),
    headers=auth_headers,
).json()

print(workflows)
Delete a workflow from your account.
DELETE https://stevesie.com/cloud/api/v1/workflows/Workflow ID
Headers
{}
If you're doing advanced work with workflows and prefer to manually manage the proxies, you will want to launch and terminate proxies using the API.
Create a new dedicated proxy with its own IP address for exclusive use.
POST https://stevesie.com/cloud/api/v1/proxies
Headers
Body (Optional)
{ "location": "nyc" }
Proxy Locations
nyc | 🇺🇸 New York, United States
sfo | 🇺🇸 San Francisco, United States
ams | 🇳🇱 Amsterdam, Netherlands
sgp | 🇸🇬 Singapore, Singapore
lon | 🇬🇧 London, United Kingdom
fra | 🇩🇪 Frankfurt, Germany
tor | 🇨🇦 Toronto, Canada
blr | 🇮🇳 Bangalore, India
{ "object": { "id": "...", "ip_address": "...", "intercept_port": "..." } }
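import requests

# Launch a dedicated proxy in the chosen location.
# The URL matches the POST /proxies endpoint documented above.
r = requests.post(
    'https://stevesie.com/cloud/api/v1/proxies',
    json={'location': 'nyc'},
    headers={
        'Token': 'XXX',
    },
)

response_json = r.json()
print(response_json)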
Terminate a specific proxy.
DELETE https://stevesie.com/cloud/api/v1/proxies/Proxy ID
Headers
{}
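Because a launched proxy should eventually be terminated, one way to pair the two calls above is a context manager that terminates the proxy even if the work in between fails. This is only a sketch: `dedicated_proxy` is a hypothetical name, and `send(method, url, json_body)` is a hypothetical callable you supply that performs the authenticated HTTP request and returns the decoded JSON response.

```python
from contextlib import contextmanager

URL_BASE = 'https://stevesie.com/cloud/api/v1'


@contextmanager
def dedicated_proxy(send, location='nyc'):
    """Launch a proxy, yield its description, and always terminate it.

    `send` is a caller-supplied function (hypothetical) that makes the HTTP
    request with the Token header and returns the parsed JSON response.
    """
    created = send('POST', '{}/proxies'.format(URL_BASE), {'location': location})
    proxy = created['object']  # contains id, ip_address, intercept_port
    try:
        yield proxy
    finally:
        # Runs on normal exit and on exceptions alike.
        send('DELETE', '{}/proxies/{}'.format(URL_BASE, proxy['id']), None)
```

With the requests library, `send` could be a thin wrapper around `requests.request(method, url, json=json_body, headers={'Token': API_TOKEN}).json()`.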
If you prefer to manage workflows using input collections, you can declare new collections over API and add and remove items as well.
Create a new collection
POST https://stevesie.com/cloud/api/v1/collections
Headers
Body
name=collection_name&items[]=value1&items[]=value2
{ "object": { "id": "...", "name": "collection_name", "count": 2 } }
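The body shown above uses repeated items[] keys rather than JSON. A minimal sketch of producing that encoding with the Python standard library follows; the helper name `encode_collection_form` is hypothetical, and it is an assumption that the API expects a standard form-encoded body here.

```python
from urllib.parse import urlencode


def encode_collection_form(name, items):
    """Encode a collection name and its items as a form body.

    Each item becomes its own "items[]" pair, in the order given.
    """
    pairs = [('name', name)] + [('items[]', item) for item in items]
    # urlencode percent-encodes the brackets ("[" -> %5B, "]" -> %5D),
    # which is the standard encoding of the same key.
    return urlencode(pairs)


print(encode_collection_form('collection_name', ['value1', 'value2']))
# prints: name=collection_name&items%5B%5D=value1&items%5B%5D=value2
```

If you use the requests library, passing the same list of pairs as `data=` produces an equivalent form-encoded body.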
List all collections
GET https://stevesie.com/cloud/api/v1/collections
Headers
{ "objects": [ { "id": "...", "name": "collection_name", "count": 2 } ] }
Edit the contents of a collection
POST https://stevesie.com/cloud/api/v1/collections/Collection ID
Headers
Body
{ "method": "add|remove|set", "items": [ ... ] }
Notes
Method should be either add, remove, or set.
You can provide either a list of strings for a simple collection, e.g. [ "value1", "value2" ], or a list of JSON values, e.g. [ { "latitude": "123.56", "longitude": "456.33" }, { "latitude": "120.99", "longitude": "460.20" } ]
{ "object": { "id": "...", "name": "collection_name", "count": 2 } }
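To illustrate the notes above, here is a small sketch that builds the JSON body for a collection edit and rejects an unknown method before any request is made. The helper name `build_collection_edit` is hypothetical, and refusing a list that mixes strings with JSON objects is an assumption drawn from the "either ... or" wording, not documented behavior.

```python
import json

VALID_METHODS = ('add', 'remove', 'set')


def build_collection_edit(method, items):
    """Return the JSON body for editing a collection's contents."""
    if method not in VALID_METHODS:
        raise ValueError('method must be one of {}'.format(VALID_METHODS))
    items = list(items)
    all_strings = all(isinstance(item, str) for item in items)
    all_objects = all(isinstance(item, dict) for item in items)
    # Assumption: a collection holds strings or JSON objects, not a mix.
    if not (all_strings or all_objects):
        raise ValueError('items must be all strings or all JSON objects')
    return json.dumps({'method': method, 'items': items})
```

For example, `build_collection_edit('add', ['value1', 'value2'])` yields a body that can be sent as the POST payload to the collection's URL.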
Delete a collection
DELETE https://stevesie.com/cloud/api/v1/collections/Collection ID
Headers
{}
You can view a list of apps available in the public app directory using a public API endpoint.
Show all of the apps publicly listed.
GET https://stevesie.com/cloud/api/v1/apps
{ "objects": [ { "id": "0146ad90-4aee-44dd-bd75-6153a7472834", "name": "Twitter", "website": "https://developer.twitter.com/en/docs", "domain": "twitter.com", "slug": "twitter", "is_api_wrapper": true, "is_featured": false, "category": "social_media" } ] }