Getting Started with the Select State.gov Data (SSD) API

Last updated: 08/07/2014

Overview

The Select State.gov Data (SSD) API is a REST-based API for accessing the content and metadata posted on www.state.gov. Because it is REST-based, it can be called from either client-side or server-side applications. You can use any programming language that can make HTTP GET requests and handle JSON responses.

Note: A token is not needed to begin using the API.

The REST API

REST, or Representational State Transfer, is a standard way of accessing data stored in a remote system over HTTP. It is one of the technologies that powers web services. Its primary job is to encapsulate the internal workings of the complex data transfer process, thereby shielding the developer from needing to understand them. The only thing your code needs to understand is the format of the data that is sent back.

  • The Read API consists of a set of commands that perform a variety of queries on our servers and return sets of data as Data Transfer Objects (DTOs). For example, one API command is get_commands, which returns an array of commands, where each command contains a name and a description. The return data is formatted as JSON strings. JSON (JavaScript Object Notation) is a lightweight way of transferring complex objects as strings, and nearly every language today has built-in or publicly available libraries to parse JSON strings into native objects.

    Although URLs for www.state.gov content items (e.g., content_url or site_url in a JSON response) change infrequently, they may change without notice. Retrieve the URLs of the content used in your application from the API on a regular basis rather than hard-coding them.

The API, version 1.0, is located at http://www.state.gov/api/v1, but you will need to provide a valid command as described in the sections below.

Calling commands

A command call is an HTTP GET request to a URL on the API server. The request includes the name of the command you are calling and its input arguments, which are passed as parameters in the URL. The body of the HTTP response contains the results of the call as a JSON string. All SSD API calls use the base URL http://www.state.gov/api/v1.

For example, to retrieve all the Secretary appointment schedules, you make an HTTP request that looks like:

http://www.state.gov/api/v1?command=get_appt_schedules

Note that if you call get_appt_schedules without search parameters, all appointment schedules are returned (in blocks of 100) and sorted in reverse chronological order with the most recent being first.

This is a working request: open it in a browser and the results of the call are displayed. This is a raw command call, so what comes back is unprocessed and unformatted. The examples show ways to take the returned data and shape it for use in an application.
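
As a minimal sketch of the request described above, the following Python builds the command URL from the base URL and sends the GET request using only the standard library. The helper names build_url and call_command are illustrative, not part of the API, and call_command needs network access to a responding server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://www.state.gov/api/v1"  # base URL given in this guide


def build_url(command, **params):
    """Return the request URL for a command and its optional parameters."""
    query = {"command": command}
    query.update(params)
    # safe="," keeps comma-separated values (for example a fields list) readable
    return BASE_URL + "?" + urlencode(query, safe=",")


def call_command(command, timeout=30, **params):
    """Send the GET request and parse the JSON body into Python objects."""
    with urlopen(build_url(command, **params), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access
print(build_url("get_appt_schedules"))
```

Keeping URL construction separate from the network call makes it easy to log or inspect the exact request before it is sent.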

Incorporating the API

Because API requests are simple HTTP calls, you can include them just about anywhere in your application. Every popular language for the web, server- or client-side, has syntax for making HTTP requests. These are what you use to include API calls in your application.

Chances are your request will be made on demand: for example, when a page loads, when a user clicks a button, or when some other event occurs. To handle this, wrap the HTTP request in a function and call it in response to a programmatic event, such as an onClick event. You must also handle the response and, if you are not working in JavaScript, parse the JSON string output into native objects so that you can work with the data. Be aware that the members of a JSON object are not guaranteed to arrive in any particular order; use a JSON parser and look up the fields you want by name.
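
The pattern above might look like the following in Python. This is a sketch under assumptions: on_click stands in for whatever event hook your framework provides, and fetch is any callable that returns the response body as a string, so the example runs without a network connection. The two sample bodies are hand-written, using the appt_schedules and title names that appear in this guide.

```python
import json


def parse_response(body):
    """Parse a JSON response body (a string) into native Python objects."""
    return json.loads(body)


def titles_from(body, collection="appt_schedules"):
    """Look fields up by name; never rely on their position in the string."""
    data = parse_response(body)
    return [item.get("title") for item in data.get(collection, [])]


def on_click(fetch):
    """Hypothetical event handler: fetch is any callable returning the body."""
    return titles_from(fetch())


# Two hand-written bodies with the same content but a different member order
sample_a = '{"success": true, "appt_schedules": [{"title": "October 21 - Tuesday"}]}'
sample_b = '{"appt_schedules": [{"title": "October 21 - Tuesday"}], "success": true}'

# Lookup by name gives the same result whatever the order of the members
print(on_click(lambda: sample_a) == on_click(lambda: sample_b))
```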

API parameters

The basic Read API calls, like get_appt_schedules, can be refined with additional parameters. The API reference lists the complete set, but here is an overview of just a few of the things you can do:

  • Page the search results. You can opt to return pages of data if you expect a large number of results. You can set the page size and select a specific page to be returned. The maximum page size is 300; if you don't set the page size, results are returned in pages of 100. For example: http://www.state.gov/api/v1?command=get_appt_schedules&per_page=10
  • Select fields for the return set. In most use cases you only need specific fields of data, like the title and the site_url fields. In the case of get_appt_schedules, those fields are returned by default, but you can request additional fields as well, separated by commas. For example: http://www.state.gov/api/v1?command=get_appt_schedules&fields=terms,content_url
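
The two refinements above can be sketched as a small URL builder. The names build_query and MAX_PER_PAGE are illustrative. The per_page and fields parameters come from this guide; the request parameter for selecting a page is assumed here to be called page (this guide shows page only as a response field), and the lower bound of 1 on per_page is also an assumption, so check the API reference for both.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.state.gov/api/v1"
MAX_PER_PAGE = 300  # maximum page size stated in this guide


def build_query(command, per_page=None, page=None, fields=None):
    """Build a request URL with optional paging and field selection."""
    # Lower bound of 1 is an assumption; the upper bound is from this guide
    if per_page is not None and not 1 <= per_page <= MAX_PER_PAGE:
        raise ValueError("per_page must be between 1 and %d" % MAX_PER_PAGE)
    query = {"command": command}
    if per_page is not None:
        query["per_page"] = per_page
    if page is not None:
        query["page"] = page  # assumed parameter name, see the API reference
    if fields:
        query["fields"] = ",".join(fields)  # fields are comma-separated
    return BASE_URL + "?" + urlencode(query, safe=",")


print(build_query("get_appt_schedules", per_page=10))
print(build_query("get_appt_schedules", fields=["terms", "content_url"]))
```

Validating per_page before sending the request avoids a round trip that the server would otherwise reject or adjust.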

Note that sorting the results from the SSD API is currently handled automatically. Since most content is date-driven, results are sorted by date in descending order with the most recent item being first. In cases where the date has not been assigned to the content, the results are sorted alphabetically. For a full description, please consult the API reference.

Working with JSON

When you work with JSON strings, make sure you use the appropriate JSON syntax. String values are enclosed in quotes, while numbers and boolean values are not. For example, in the following response to the get_country_fact_sheets&per_page=1 command, the title field is a string, success is a boolean, and the page, per_page, pages, page_record_count, total_record_count and id fields are numbers:

{
    "api_version": "1.0",
    "success": true,
    "page": 0,
    "per_page": 1,
    "pages": 202,
    "page_record_count": 1,
    "total_record_count": 202,
    "country_fact_sheets": [
        {
            "id": 5380,
            "title": "Afghanistan"
        }
    ]
}

Note that slashes "/" returned in a JSON string are escaped with a backslash "\".
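
To illustrate, here is a short Python sketch that parses the response shown above with the standard json module and checks the resulting native types. The second string is a hand-written fragment, not real API output; it only shows that a JSON parser turns the escaped slashes back into plain ones.

```python
import json

# The get_country_fact_sheets&per_page=1 response shown above
body = r'''
{
    "api_version": "1.0",
    "success": true,
    "page": 0,
    "per_page": 1,
    "pages": 202,
    "page_record_count": 1,
    "total_record_count": 202,
    "country_fact_sheets": [
        {
            "id": 5380,
            "title": "Afghanistan"
        }
    ]
}
'''

data = json.loads(body)
sheet = data["country_fact_sheets"][0]

# JSON booleans, numbers and strings become bool, int and str
print(type(data["success"]).__name__)  # bool
print(type(data["pages"]).__name__)    # int
print(type(sheet["title"]).__name__)   # str

# Illustrative fragment: the parser removes the backslash escapes
escaped = r'{"site_url": "http:\/\/www.state.gov\/"}'
print(json.loads(escaped)["site_url"])  # http://www.state.gov/
```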

The API supports both raw JSON (as seen in the example above) and JSONP. This lets your application call the API from the server side as well as from the client side, using languages such as JavaScript.

The JSONP format is similar to the JSON format, except that the response is wrapped in a callback function named in the request. This allows JavaScript to call the state.gov server without violating the same-origin policy. To request a JSONP response, add the callback argument to your request with the name of your callback function. The response is wrapped in a call to that function, so the function runs with the data once the response has loaded.

Here is the same request as above with the callback argument set to myCallbackFunction: get_country_fact_sheets&per_page=1&callback=myCallbackFunction

myCallbackFunction({
    "api_version": "1.0",
    "success": true,
    "page": 0,
    "per_page": 1,
    "pages": 202,
    "page_record_count": 1,
    "total_record_count": 202,
    "country_fact_sheets": [
        {
            "id": 5380,
            "title": "Afghanistan"
        }
    ]
});
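
JSONP is intended for browsers, where the callback is executed as script. If a JSONP response ever reaches server-side code, the wrapper has to be removed before the payload can be parsed. The following Python sketch shows one way to do that; unwrap_jsonp is a hypothetical helper, and the pattern assumes the simple name(...); shape shown above.

```python
import json
import re

# Matches: callbackName( <payload> ) with an optional trailing semicolon
JSONP_PATTERN = re.compile(r"^\s*([A-Za-z_$][\w$.]*)\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def unwrap_jsonp(text):
    """Return (callback_name, parsed_payload) for a JSONP response string."""
    match = JSONP_PATTERN.match(text)
    if match is None:
        raise ValueError("not a JSONP response")
    return match.group(1), json.loads(match.group(2))


name, payload = unwrap_jsonp('myCallbackFunction({"success": true, "page": 0});')
print(name, payload["success"])
```

For server-side calls it is simpler to omit the callback argument and request raw JSON.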

Review the API Reference for the correct syntax for each command you use. For more information about JSON, visit json.org.

Content availability

To enhance performance, API calls are performed on a secondary, replicated server. When a call is made, the results are pulled from that secondary source. The frequency of data replication from the original to the API server varies according to several factors that may be subject to change. Generally, you can expect content to be available within 60 minutes of its original post to www.state.gov.

Error handling

The API tries to catch common errors, such as a URL parameter that doesn't exist, and handle them in a code-friendly way. Errors are returned as JSON strings with a message parameter like this:

{"api_version": "1.0","success":false,"code":200,"message":"Invalid argument"}

When the problem is a warning and not an error, the API still returns results, but with a warning message array. For example, with the request http://www.state.gov/api/v1?command=get_appt_schedules&fields=author,title&per_page=1, we can see that author is an invalid value for the fields parameter, but we are still getting results to the overall request as shown below:

{
    "api_version": "1.0",
    "success": true,
    "warning": [
        "Invalid fields: author"
    ],
    "page": 0,
    "per_page": 1,
    "pages": 1197,
    "page_record_count": 1,
    "total_record_count": 1197,
    "appt_schedules": [
        {
            "title": "October 21 - Tuesday"
        }
    ]
}

As a standard practice, your code should always try to handle errors gracefully rather than letting the end-user deal with a cryptic message or blank page. By handling the "message" parameter and the "warning" parameter of the result object (after JSON-parsing the response), you will know what might have gone wrong and can act upon it by programming an error message for the end-user.

Error messages returned by the API include a numerical error code that classifies errors by type. For more information, see the Error Message Reference.
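
A minimal sketch of this handling in Python follows. ApiError and check_response are illustrative names; the success, code, message and warning fields are the ones shown in the responses above. The sample bodies are the error response from this section and a shortened version of the warning response.

```python
import json


class ApiError(Exception):
    """Raised when a response reports success: false."""

    def __init__(self, code, message):
        super().__init__("API error %s: %s" % (code, message))
        self.code = code
        self.message = message


def check_response(body):
    """Return (data, warnings); raise ApiError when the call failed."""
    data = json.loads(body)
    if not data.get("success", False):
        raise ApiError(data.get("code"), data.get("message", "Unknown error"))
    # A warning does not stop the request, so results are still returned
    return data, data.get("warning", [])


error_body = '{"api_version": "1.0","success":false,"code":200,"message":"Invalid argument"}'
warning_body = ('{"api_version": "1.0", "success": true, '
                '"warning": ["Invalid fields: author"], '
                '"appt_schedules": [{"title": "October 21 - Tuesday"}]}')

data, warnings = check_response(warning_body)
print(warnings)

try:
    check_response(error_body)
except ApiError as err:
    # Show the end user a friendly message instead of a blank page
    print("Sorry, the request could not be completed:", err.message)
```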

Strategies for retrying after errors occur

Your retry strategy should depend on the specific error you get from the API.

First, always use a single thread for requests and keep your requests to just the information you need to complete the action.

If you get a timeout error, modify your request to target only the data that you need. Wait a minute or so, then retry.

For any transient errors, immediate retry is probably the best solution.
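
The strategy above can be sketched as a single-threaded retry loop. Everything named here is an assumption for illustration: TransientError is a hypothetical marker exception, the limit of three attempts is arbitrary, and how a timeout surfaces depends on your HTTP client (this sketch assumes it is raised as Python's built-in TimeoutError). Narrowing the request after a timeout is left to the caller.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying immediately."""


def call_with_retry(request, max_attempts=3, timeout_wait=60, sleep=time.sleep):
    """Call request() one attempt at a time, retrying according to error type."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TimeoutError:
            if attempt == max_attempts:
                raise
            sleep(timeout_wait)  # wait a minute or so, then retry
        except TransientError:
            if attempt == max_attempts:
                raise
            # transient error: fall through and retry immediately
```

A usage example would pass a zero-argument callable, such as a lambda that performs one API request; the sleep parameter exists so the wait can be replaced in tests.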

Limitations

  • The API provides access to some of the more common metadata that was originally attached to the content, but not all metadata.
  • For content available only in HTML, the API does not break down such results into more discrete fields.
  • Sorting the results from the SSD API is currently handled automatically. Since most content is date-driven, results are sorted by date in descending order with the most recent item being first. In cases where the date has not been assigned to the content, the results are sorted alphabetically. For more information, please consult the API reference.
  • Some HTML content will be very large. While the API provides a means to page through the number of responses to a request, it does not provide a way to page through specific fields of a single content item that was returned. Developers may wish to parse, or otherwise limit, the HTML chunks that are returned.
  • Data prior to 2009 is not available via the API.
  • Only a select portion of content is currently available through the API. More content types will be added over time.
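
As one possible way to limit the large HTML chunks mentioned above, the following Python sketch reduces an HTML field to plain text and truncates it, using the standard html.parser module. The names TextExtractor and html_to_text and the default limit of 500 characters are illustrative choices, not API features.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text content of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html, max_chars=500):
    """Return at most max_chars characters of whitespace-normalized text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join the text fragments with spaces, then collapse runs of whitespace
    text = " ".join(" ".join(parser.parts).split())
    return text[:max_chars]


print(html_to_text("<p>Hello <b>world</b></p>"))
```

Truncation here is by character count and may cut a word in half; an application might prefer to cut at a sentence or paragraph boundary.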