Data Extractors

Understand how Stevesie automatically parses API responses into collections.

When an API returns JSON (as most modern APIs do), the Stevesie platform automatically converts the response into more usable "collections."

JSON to Collections

Let's say an API returns the following JSON to us:

  {
    "results": [
      {
        "id": 1,
        "name": "Alice",
        "children": [
          "Carl",
          "Dave"
        ]
      },
      {
        "id": 2,
        "name": "Bob",
        "children": [
          "Eric",
          "Fred"
        ]
      }
    ]
  }

The Stevesie platform will automatically "flatten" this structure into "collections" by identifying where the JSON begins repeating common data patterns within the hierarchy.

In this example, it will identify two collections defined by their JSON lookup paths: results and results.children.
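As a rough illustration of how such lookup paths can be found, the sketch below walks a parsed response and records the dotted path of every array it meets. It assumes a "repeating pattern" simply means a JSON array; the function name find_collection_paths is hypothetical and this is not Stevesie's actual implementation.

```python
import json


def find_collection_paths(node, path=""):
    """Return the dotted path of every list found in a parsed JSON value."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            paths.extend(find_collection_paths(value, child_path))
    elif isinstance(node, list):
        # Assumption: every array counts as one collection.
        paths.append(path)
        for item in node:
            # Items share the array's path, so nested arrays are deduplicated.
            for found in find_collection_paths(item, path):
                if found not in paths:
                    paths.append(found)
    return paths


response = json.loads("""
{
  "results": [
    {"id": 1, "name": "Alice", "children": ["Carl", "Dave"]},
    {"id": 2, "name": "Bob", "children": ["Eric", "Fred"]}
  ]
}
""")

collection_paths = find_collection_paths(response)
print(collection_paths)  # ['results', 'results.children']
```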

The results collection will be represented as follows, displaying the first value for its child collections:

results.id	results.name	results.children[0]
1	Alice	Carl
2	Bob	Eric

Likewise, the collection data for results.children will be as follows, where the trailing columns refer back to the parent object in the hierarchy:

results.children	results.id	results.name
Carl	1	Alice
Dave	1	Alice
Eric	2	Bob
Fred	2	Bob

This method allows us to quickly "de-normalize" the child data so we can view it in a standard format with the relationships to parents intact.
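The two tables can be reproduced by hand with a short script. This is only a sketch of the de-normalizing idea for this one example response; the helper names results_rows and children_rows are made up for illustration and the platform's own logic may differ.

```python
response = {
    "results": [
        {"id": 1, "name": "Alice", "children": ["Carl", "Dave"]},
        {"id": 2, "name": "Bob", "children": ["Eric", "Fred"]},
    ]
}


def results_rows(data):
    """One row per parent, showing only the first child value."""
    rows = []
    for parent in data["results"]:
        children = parent.get("children", [])
        rows.append({
            "results.id": parent["id"],
            "results.name": parent["name"],
            "results.children[0]": children[0] if children else None,
        })
    return rows


def children_rows(data):
    """One row per child, with the parent's fields repeated on each row."""
    rows = []
    for parent in data["results"]:
        for child in parent.get("children", []):
            rows.append({
                "results.children": child,
                "results.id": parent["id"],
                "results.name": parent["name"],
            })
    return rows


for row in children_rows(response):
    print(row)
```

Repeating the parent fields on every child row is what keeps the relationship intact once the hierarchy has been flattened.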

Downloading Results

After executing an endpoint on the Stevesie platform, you'll be presented with the parsed collections, which you can scroll through as previews or download.

Expanded CSV

If you'd like to preserve the child-parent relationships, download the data in "Expanded CSV" format, which returns CSV files corresponding to the tables outlined above.
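Because each child row carries its parent's columns, the hierarchy can be rebuilt from such a file. The sketch below assumes the downloaded file is comma-delimited and uses the column headers from the results.children table above; the exact file layout is an assumption, so check a real download before relying on it.

```python
import csv
import io
from collections import defaultdict

# Stand-in for the contents of a downloaded file (layout assumed).
csv_text = """results.children,results.id,results.name
Carl,1,Alice
Dave,1,Alice
Eric,2,Bob
Fred,2,Bob
"""

# Group child values under the parent name repeated on each row.
children_by_parent = defaultdict(list)
for row in csv.DictReader(io.StringIO(csv_text)):
    children_by_parent[row["results.name"]].append(row["results.children"])

print(dict(children_by_parent))
# {'Alice': ['Carl', 'Dave'], 'Bob': ['Eric', 'Fred']}
```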

JSON Format

You can also download a collection in JSON format. Note that this returns the flattened collection rows rather than the original nested response.

For example, the JSON format for the results collection above will look like this:

  [
    {
      "id": 1,
      "name": "Alice",
      "children[0]": "Carl"
    },
    {
      "id": 2,
      "name": "Bob",
      "children[0]": "Eric"
    }
  ]

The JSON format of the results.children collection would be as follows (note that this format drops the parent relations):

  [
    {
      "children": "Carl"
    },
    {
      "children": "Dave"
    },
    {
      "children": "Eric"
    },
    {
      "children": "Fred"
    }
  ]

Once you find the collections you're interested in (e.g. either the results or results.children collection above), you can create an "extractor" that remembers where this data lives in the response so you can quickly extract it in future runs.

After executing an endpoint and seeing the parsed results, just click the arrow button next to the collection you'd like to make an extractor for and select "New Extractor."

Output Fields

By default, extractors convert every JSON key found in the response into a column (or key) in the output. This ensures that you always get ALL the data back, but the number and order of columns can differ between responses.

If you only care about certain columns, you can declare a "whitelist" of the columns you want back.
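The effect of a whitelist can be pictured with a few lines of Python. This is a sketch of the concept only, not how the platform is configured; apply_whitelist is a hypothetical helper, and filling missing columns with None is an assumption.

```python
def apply_whitelist(rows, whitelist):
    """Keep only whitelisted columns, in whitelist order.

    Columns missing from a row are filled with None (an assumption).
    """
    return [{column: row.get(column) for column in whitelist} for row in rows]


# Rows with differing keys and key order, as can happen without a whitelist.
rows = [
    {"id": 1, "name": "Alice", "children[0]": "Carl"},
    {"name": "Bob", "id": 2, "nickname": "Bobby"},
]

trimmed = apply_whitelist(rows, ["id", "name"])
print(trimmed)  # [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
```

Every output row then has the same columns in the same order, whatever keys the response happened to contain.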

Row Filters

You can also configure extractors to save only records that match specific criteria, e.g. a particular field such as "follower count" being above or below a threshold.
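A threshold filter of this kind behaves roughly like the sketch below. The field name follower_count and the helper filter_rows are hypothetical, and skipping rows where the field is missing or non-numeric is an assumption about how such rows might be handled.

```python
import operator


def filter_rows(rows, field, compare, threshold):
    """Keep rows whose numeric field satisfies compare(value, threshold)."""
    kept = []
    for row in rows:
        value = row.get(field)
        # Rows without a numeric value for the field are dropped (assumption).
        if isinstance(value, (int, float)) and compare(value, threshold):
            kept.append(row)
    return kept


rows = [
    {"name": "Alice", "follower_count": 1200},
    {"name": "Bob", "follower_count": 85},
    {"name": "Carl"},
]

popular = filter_rows(rows, "follower_count", operator.gt, 1000)
small = filter_rows(rows, "follower_count", operator.lt, 1000)
print(popular)  # [{'name': 'Alice', 'follower_count': 1200}]
print(small)    # [{'name': 'Bob', 'follower_count': 85}]
```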
