REST API

Python representation of the DataSift REST API.

http://dev.datasift.com/docs/rest-api

datasift.client module

Core REST API.

class datasift.client.Client(*args, **kwargs)[source]

Bases: object

DataSift client class.

Used to interact with the DataSift REST API.

Parameters:
  • user (str) – username for the DataSift platform
  • apikey (str) – API key for the DataSift platform
  • ssl (bool) – (optional) whether to enable SSL, default is True
  • proxies (dict) – (optional) dict of proxies for requests to use, of the form {“https”: “http://me:password@myproxyserver:port/” }
  • timeout (float) – (optional) seconds to wait for HTTP connections
  • verify (bool) – (optional) whether to verify SSL certificates
  • api_host (str) – (optional) to change from the default DataSift host
  • api_version (str) – (optional) to change from the default DataSift version
  • async (bool) – (optional) specifies if this client should go into async mode, defaults to False
  • max_workers (int) – (optional) maximum number of worker threads to use while in async mode, defaults to 10
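
The optional arguments above can be collected in a plain dict before the client is constructed. The following is a minimal sketch, not the library's own code: the helper name `client_options` and the example proxy URL are hypothetical. Because `async` is a reserved word from Python 3.7 onward, it cannot be written as a literal keyword argument on those versions, so the sketch sets it as a dict key. Whether a given release of the client accepts that key is an assumption to check against the installed version.

```python
def client_options(proxy_url=None, timeout=None, use_async=False, max_workers=10):
    """Collect optional Client keyword arguments in a dict (hypothetical helper)."""
    options = {"ssl": True, "verify": True}
    if proxy_url is not None:
        # Same shape as the proxies example in the parameter list.
        options["proxies"] = {"https": proxy_url}
    if timeout is not None:
        options["timeout"] = float(timeout)
    if use_async:
        # "async" is a reserved word in Python 3.7+, so it is set as a dict key
        # rather than written as a keyword argument.
        options["async"] = True
        options["max_workers"] = max_workers
    return options


options = client_options(
    proxy_url="http://me:password@myproxyserver:8080/",  # placeholder proxy
    timeout=30,
    use_async=True,
)
# Assumed usage, with placeholder credentials:
#     from datasift.client import Client
#     client = Client(user="your_username", apikey="your_api_key", **options)
```

Unpacking the dict with `**options` keeps the call valid on interpreters where `async=True` would be a syntax error.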
balance()[source]

Determine your credit or DPU balance

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/balance

Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
compile(csdl)[source]

Compile the given CSDL.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/compile

Raises a DataSiftApiException for any error given by the REST API, including CSDL compilation errors.

Parameters:csdl (str) – CSDL to compile
Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
dpu(hash=None, historics_id=None)[source]

Calculate the DPU cost of consuming a stream.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/dpu

Parameters:
  • hash (str) – (optional) target CSDL filter hash
  • historics_id (str) – (optional) target Historics ID
Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
is_valid(csdl)[source]

Checks if the given CSDL is valid.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/validate

Parameters:csdl (str) – CSDL to validate
Returns:Boolean indicating the validity of the CSDL
Return type:bool
Raises:DataSiftApiException, requests.exceptions.HTTPError
on_closed(func)[source]

Function to set the callback for the closing of a stream.

Can be called manually:

def close_callback():
    teardown_stream()
client.on_closed(close_callback)

or as a decorator:

@client.on_closed
def close_callback():
    teardown_stream()
on_delete(func)[source]

Function to set the callback for the deletion of an item on an active stream.

Can be called manually:

def delete_callback(interaction):
    delete(interaction)
client.on_delete(delete_callback)

or as a decorator:

@client.on_delete
def delete_callback(interaction):
    delete(interaction)
on_ds_message(func)[source]

Function to set the callback for an incoming interaction.

Can be called manually:

def message_callback(interaction):
    process(interaction)
client.on_ds_message(message_callback)

or as a decorator:

@client.on_ds_message
def message_callback(interaction):
    process(interaction)
on_open(func)[source]

Function to set the callback for the opening of a stream.

Can be called manually:

def open_callback(data):
    setup_stream()
client.on_open(open_callback)

or as a decorator:

@client.on_open
def open_callback():
    setup_stream()
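
The callback setters above are all described as usable either manually or as a decorator. A minimal sketch of one way a registration method can support both styles follows. `CallbackRegistry` is a hypothetical stand-in, not the client's implementation: it stores the function and returns it unchanged, so the decorated name still refers to the original function.

```python
class CallbackRegistry:
    """Hypothetical stand-in showing one way to support both calling styles."""

    def __init__(self):
        self.callbacks = {}

    def on_open(self, func):
        # Store the callback, then return it unchanged so that using this
        # method as a decorator does not replace the function with None.
        self.callbacks["open"] = func
        return func


registry = CallbackRegistry()


def open_manual():
    return "opened (manual)"


registry.on_open(open_manual)      # manual registration


@registry.on_open                  # decorator registration, replaces the previous callback
def open_decorated():
    return "opened (decorator)"
```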
pull(subscription_id, size=None, cursor=None)[source]

Pulls a series of interactions from the queue for the given subscription ID.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pull

Parameters:
  • subscription_id (str) – The ID of the subscription to pull interactions for
  • size (int) – the max amount of data to pull in bytes
  • cursor (str) – an ID to use as the point in the queue from which to start fetching data
Returns:

list of interactions with extra response data

Return type:

ResponseList

Raises:

DataSiftApiException, requests.exceptions.HTTPError

start_stream_subscriber()[source]

Starts the stream consumer’s main loop.

Called when the stream consumer has been set up with the correct callbacks.

subscribe(stream)[source]

Subscribe to a stream.

Parameters:stream (str) – stream to subscribe to
Raises:StreamSubscriberNotStarted, DeleteRequired, StreamNotConnected

Used as a decorator, e.g.:

@client.subscribe(stream)
def subscribe_to_hash(msg):
    print(msg)
usage(period='hour')[source]

Check the number of objects processed and delivered for a given time period

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/usage

Parameters:period (str) – (optional) time period to measure usage for, can be one of “day”, “hour” or “current” (5 minutes), default is hour
Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
validate(csdl)[source]

Checks if the given CSDL is valid.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/validate

Parameters:csdl (str) – CSDL to validate
Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError

datasift.push module

Push REST API.

class datasift.push.Push(request)[source]

Bases: object

create_from_hash(stream, name, output_type, output_params, initial_status=None, start=None, end=None)[source]

Create a new push subscription using a live stream.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushcreate

Parameters:
  • stream (str) – The hash of a DataSift stream.
  • name (str) – The name to give the newly created subscription
  • output_type (str) – One of the supported output types e.g. s3
  • output_params (dict) – The set of parameters required for the given output type
  • initial_status (str) – The initial status of the subscription, active, paused or waiting_for_start
  • start (int) – Optionally specifies when the subscription should start
  • end (int) – Optionally specifies when the subscription should end
Returns:

dict with extra response data

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

create_from_historics(historics_id, name, output_type, output_params, initial_status=None, start=None, end=None)[source]

Create a new push subscription using the given Historic ID.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushcreate

Parameters:
  • historics_id (str) – The ID of a Historics query
  • name (str) – The name to give the newly created subscription
  • output_type (str) – One of the supported output types e.g. s3
  • output_params (dict) – set of parameters required for the given output type, see dev.datasift.com
  • initial_status (str) – The initial status of the subscription, active, paused or waiting_for_start
  • start (int) – Optionally specifies when the subscription should start
  • end (int) – Optionally specifies when the subscription should end
Returns:

dict with extra response data

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

delete(subscription_id)[source]

Delete the subscription for the given ID.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushdelete

Parameters:subscription_id (str) – id of an existing Push Subscription.
Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
get(subscription_id=None, stream=None, historics_id=None, page=None, per_page=None, order_by=None, order_dir=None, include_finished=None)[source]

Show details of the Subscriptions belonging to this user.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushget

Parameters:
  • subscription_id (str) – optional id of an existing Push Subscription
  • stream (str) – optional hash of a live stream
  • historics_id (str) – optional playback id of a Historics query
  • page (int) – optional page number for pagination
  • per_page (int) – optional number of items per page, default 20
  • order_by (str) – field to order by, default request_time
  • order_dir (str) – direction to order by, asc or desc, default desc
  • include_finished (bool) – boolean indicating if finished Subscriptions for Historics should be included
Returns:

dict with extra response data

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

log(subscription_id=None, page=None, per_page=None, order_by=None, order_dir=None)[source]

Retrieve any messages that have been logged for your subscriptions.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushlog

Parameters:
  • subscription_id (str) – optional id of an existing Push Subscription, restricts logs to a given subscription if supplied.
  • page (int) – optional page number for pagination
  • per_page (int) – optional number of items per page, default 20
  • order_by (str) – field to order by, default request_time
  • order_dir (str) – direction to order by, asc or desc, default desc
Returns:

dict with extra response data

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

pause(subscription_id)[source]

Pause a Subscription and buffer the data for up to one hour.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushpause

Parameters:subscription_id (str) – id of an existing Push Subscription.
Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
resume(subscription_id)[source]

Resume a previously paused Subscription.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushresume

Parameters:subscription_id (str) – id of an existing Push Subscription.
Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
stop(subscription_id)[source]

Stop the given subscription from running.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushstop

Parameters:subscription_id (str) – id of an existing Push Subscription.
Returns:dict with extra response data
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
update(subscription_id, output_params, name=None)[source]

Update the name or output parameters for an existing Subscription.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushupdate

Parameters:
  • subscription_id (str) – id of an existing Push Subscription.
  • output_params (dict) – new output parameters for the subscription, see dev.datasift.com
  • name (str) – optional new name for the Subscription
Returns:

dict with extra response data

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

validate(output_type, output_params)[source]

Check that a subscription is defined correctly.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/pushvalidate

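Parameters:
  • output_type (str) – One of the supported output types e.g. s3
  • output_params (dict) – The set of parameters required for the given output type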
Returns:

dict with extra response data

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

datasift.historics module

Historics REST API.

class datasift.historics.Historics(request)[source]

Bases: object

Represents the DataSift Historics REST API and provides the ability to query it. Internal class instantiated as part of the Client object.

delete(historics_id)[source]

Delete one specified playback query. If the query is currently running, stop it.

status_code is set to 204 on success

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsdelete

Parameters:historics_id (str) – playback id of the query to delete
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
get(historics_id=None, maximum=None, page=None, with_estimate=None)[source]

Get the Historics query with the given ID. If no ID is provided, get a list of Historics queries.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsget

Parameters:
  • historics_id (str) – (optional) ID of the query to retrieve
  • maximum (int) – (optional) maximum number of queries to receive (default 20)
  • page (int) – (optional) page to retrieve for paginated queries
  • with_estimate (bool) – include estimate of completion time in output
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

get_for(historics_id, with_estimate=None)[source]

Get the historic query for the given ID

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsget

Parameters:
  • historics_id (str) – playback id of the query
  • with_estimate (bool) – include estimate of completion time in output
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
pause(historics_id, reason='')[source]

Pause an existing Historics query.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicspause

Parameters:
  • historics_id (str) – id of the job to pause
  • reason (str) – optional reason for pausing it
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

prepare(hash, start, end, name, sources, sample=None)[source]

Prepare a historics query which can later be started.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsprepare

Parameters:
  • hash (str) – The hash of the CSDL to create the query for
  • start (int) – when to start querying data from - unix timestamp
  • end (int) – when the query should end - unix timestamp
  • name (str) – the name of the query
  • sources (list) – list of sources e.g. ['facebook', 'bitly', 'tumblr']
  • sample (int) – percentage to sample, either 10 or 100
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

HistoricSourcesRequired, DataSiftApiException, requests.exceptions.HTTPError
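
Since `start` and `end` are Unix timestamps and `sample` accepts only 10 or 100, it can help to check the arguments locally before calling `prepare`. The sketch below is an illustration under stated assumptions: `to_unix` and `prepare_arguments` are hypothetical helpers, the stream hash is a placeholder, and the local `ValueError` for an empty source list only stands in for the documented `HistoricSourcesRequired` (the exact condition under which the library raises it is assumed here).

```python
from datetime import datetime, timezone


def to_unix(dt):
    """Convert a timezone-aware datetime to an integer Unix timestamp."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return int(dt.timestamp())


def prepare_arguments(stream_hash, start, end, name, sources, sample=None):
    """Check and assemble keyword arguments for prepare() (hypothetical helper)."""
    if not sources:
        # Local stand-in for the documented HistoricSourcesRequired error.
        raise ValueError("at least one source is required")
    if sample is not None and sample not in (10, 100):
        raise ValueError("sample must be 10 or 100")
    start_ts, end_ts = to_unix(start), to_unix(end)
    if end_ts <= start_ts:
        raise ValueError("end must be after start")
    args = {"hash": stream_hash, "start": start_ts, "end": end_ts,
            "name": name, "sources": list(sources)}
    if sample is not None:
        args["sample"] = sample
    return args


args = prepare_arguments(
    "0123456789abcdef0123456789abcdef",          # placeholder stream hash
    datetime(2015, 1, 1, tzinfo=timezone.utc),
    datetime(2015, 1, 2, tzinfo=timezone.utc),
    "new year sample",
    ["tumblr"],
    sample=10,
)
# Assumed usage (the attribute name on the client is an assumption):
#     client.historics.prepare(**args)
```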

resume(historics_id)[source]

Resume a paused Historics query.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsresume

Parameters:historics_id (str) – id of the job to resume
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
start(historics_id)[source]

Start the historics job with the given ID.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsstart

Parameters:historics_id (str) – id of the job to start
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
status(start, end, sources=None)[source]

Check the data coverage in the Historics archive for a given interval.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsstatus

Parameters:
  • start (int) – Unix timestamp for the start time
  • end (int) – Unix timestamp for the end time
  • sources (list) – list of data sources to include.
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

stop(historics_id, reason='')[source]

Stop an existing Historics query.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsstop

Parameters:
  • historics_id (str) – playback id of the job to stop
  • reason (str) – optional reason for stopping the job
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

update(historics_id, name)[source]

Update the name of the given Historics query.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/historicsupdate

Parameters:
  • historics_id (str) – playback id of the query to update
  • name (str) – new name of the query
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

datasift.historics_preview module

Historics Preview REST API.

class datasift.historics_preview.HistoricsPreview(request)[source]

Bases: object

Represents the DataSift Historics Preview REST API and provides the ability to query it. Internal class instantiated as part of the Client object.

create(stream, start, parameters, sources, end=None)[source]

Create a Historics preview job.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/previewcreate

Parameters:
  • stream (str) – hash of the CSDL filter to create the job for
  • start (int) – Unix timestamp for the start of the period
  • parameters (list) – list of historics preview parameters, can be found at http://dev.datasift.com/docs/api/rest-api/endpoints/previewcreate
  • sources (list) – list of sources to include, e.g. ['tumblr', 'facebook']
  • end (int) – (optional) Unix timestamp for the end of the period, defaults to min(start+24h, now-1h)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

HistoricSourcesRequired, DataSiftApiException, requests.exceptions.HTTPError
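
The documented default for `end`, min(start+24h, now-1h), can be worked through in a few lines. This sketch only reproduces that arithmetic locally. `default_preview_end` is a hypothetical helper name, and where the default is actually applied (client or server) is not stated in this reference.

```python
import time

DAY = 24 * 60 * 60    # seconds in 24 hours
HOUR = 60 * 60        # seconds in 1 hour


def default_preview_end(start, now=None):
    """Return min(start + 24h, now - 1h) as a Unix timestamp (hypothetical helper)."""
    if now is None:
        now = int(time.time())
    return min(start + DAY, now - HOUR)


start = 1420070400                                      # 2015-01-01 00:00:00 UTC
long_ago = default_preview_end(start, now=start + 10 * DAY)   # capped at start + 24h
recent = default_preview_end(start, now=start + 2 * HOUR)     # capped at now - 1h
```

For a start time well in the past, the preview covers a full 24 hours. For a start time within the last day, it stops one hour before the current time.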

get(preview_id)[source]

Retrieve a Historics preview job.

Warning: previews expire after 24 hours.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/previewget

Parameters:preview_id (str) – hash of the Historics preview job to retrieve
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError

datasift.managed_sources module

Managed Sources REST API.

class datasift.managed_sources.Auth(request)[source]

Bases: object

Represents the Auth section of the DataSift Managed Sources REST API and provides the ability to query it. Internal class instantiated as part of the Client object.

add(source_id, auth, validate=True)[source]

Add one or more sets of authorization credentials to a Managed Source

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourceauthadd

Parameters:
  • source_id (str) – target Source ID
  • auth (array of strings) – An array of the source-specific authorization credential sets that you’re adding.
  • validate (bool) – Allows you to suppress the validation of the authorization credentials, defaults to true.
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

remove(source_id, auth_ids)[source]

Remove one or more sets of authorization credentials from a Managed Source

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourceauthremove

Parameters:
  • source_id (str) – target Source ID
  • auth_ids (array of str) – An array of the authorization credential set IDs that you would like to remove.
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

class datasift.managed_sources.ManagedSources(request)[source]

Bases: object

Represents the DataSift Managed Sources REST API and provides the ability to query it. Internal class instantiated as part of the Client object.

create(source_type, name, resources, auth=None, parameters=None, validate=True)[source]

Create a managed source

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourcecreate

Parameters:
  • source_type (str) – data source name e.g. facebook_page, googleplus, instagram, yammer
  • name (str) – name to use to identify the managed source being created
  • resources (list) – list of source-specific config dicts
  • auth (list) – list of source-specific authentication dicts
  • parameters (dict) – (optional) dict with config information on how to treat each resource
  • validate (bool) – bool to determine if validation should be performed on the source
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

delete(source_id)[source]

Delete a managed source.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourcedelete

Parameters:source_id (str) – target Source ID
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
get(source_id=None, source_type=None, page=None, per_page=None)[source]

Get a specific managed source or a list of them.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourceget

Parameters:
  • source_id (str) – (optional) target Source ID
  • source_type (str) – (optional) data source name e.g. facebook_page, googleplus, instagram, yammer
  • page (int) – (optional) page number for pagination, default 1
  • per_page (int) – (optional) number of items per page, default 20
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

log(source_id, page=None, per_page=None)[source]

Get the log for a specific Managed Source.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourcelog

Parameters:
  • source_id (str) – target Source ID
  • page (int) – (optional) page number for pagination
  • per_page (int) – (optional) number of items per page, default 20
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

start(source_id)[source]

Start consuming from a managed source.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourcestart

Parameters:source_id (str) – target Source ID
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
stop(source_id)[source]

Stop a managed source.

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourcestop

Parameters:source_id (str) – target Source ID
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
update(source_id, source_type, name, resources, auth, parameters=None, validate=True)[source]

Update a managed source

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourceupdate

Parameters:
  • source_id (str) – target Source ID
  • source_type (str) – data source name e.g. facebook_page, googleplus, instagram, yammer
  • name (str) – name to use to identify the managed source being updated
  • resources (list) – list of source-specific config dicts
  • auth (list) – list of source-specific authentication dicts
  • parameters (dict) – (optional) dict with config information on how to treat each resource
  • validate (bool) – bool to determine if validation should be performed on the source
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

class datasift.managed_sources.Resource(request)[source]

Bases: object

Represents the Resource section of the DataSift Managed Sources REST API and provides the ability to query it. Internal class instantiated as part of the Client object.

add(source_id, resources, validate=True)[source]

Add one or more resources to a Managed Source

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourceresourceadd

Parameters:
  • source_id (str) – target Source ID
  • resources (array of dict) – An array of the source-specific resources that you’re adding.
  • validate (bool) – Allows you to suppress the validation of the resource, defaults to true.
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

remove(source_id, resource_ids)[source]

Remove one or more resources from a Managed Source

Uses API documented at http://dev.datasift.com/docs/api/rest-api/endpoints/sourceresourceremove

Parameters:
  • source_id (str) – target Source ID
  • resource_ids (array of str) – An array of the resource IDs that you would like to remove.
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

datasift.pylon module

PYLON REST API.

class datasift.pylon.Pylon(request)[source]

Bases: object

Represents the DataSift Pylon API and provides the ability to query it. Internal class instantiated as part of the Client object.

analyze(id, parameters, filter=None, start=None, end=None, service='facebook')[source]

Analyze the recorded data for a given id

Parameters:
  • id (str) – The id of the recording
  • parameters (dict) – To set settings such as threshold and target
  • filter (str) – An optional secondary filter
  • start (int) – Determines time period of the analyze
  • end (int) – Determines time period of the analyze
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

compile(csdl, service='facebook')[source]

Compile the given CSDL

Parameters:
  • csdl (str) – The CSDL to be compiled for analysis
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

get(id, service='facebook')[source]

Get the existing recording for a given id

Parameters:
  • service (str) – The service for this API call (facebook, etc)
  • id (str) – The id of the recording to get
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

list(page=None, per_page=None, order_by='created_at', order_dir='DESC', service='facebook')[source]

List pylon recordings

Parameters:
  • page (int) – page number for pagination
  • per_page (int) – number of items per page, default 20
  • order_by (str) – field to order by, default created_at
  • order_dir (str) – direction to order by, asc or desc, default desc
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

sample(id, count=None, start=None, end=None, filter=None, service='facebook')[source]

Get sample interactions for a given recording

Parameters:
  • id (str) – The id of the recording to get sample interactions for
  • count (int) – Optional number of sample interactions to return
  • start (int) – Determines time period of the sample data
  • end (int) – Determines time period of the sample data
  • filter (str) – An optional secondary filter
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

start(hash=None, name=None, id=None, service='facebook')[source]

Start a recording for the provided hash

Parameters:
  • hash (str) – The hash to start recording with
  • name (str) – The name of the recording
  • id (str) – The id of the recording
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

stop(id, service='facebook')[source]

Stop the recording for the provided id

Parameters:
  • id (str) – The id of the recording to stop
  • service (str) – The service for this API call (facebook, etc)
Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError
tags(id, service='facebook')[source]

Get the existing tag analysis for a given recording id

Parameters:
  • id (str) – The id of the recording to get tag analysis for
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

update(id, hash=None, name=None, service='facebook')[source]

Update an existing recording with a new filter hash

Parameters:
  • id (str) – The id of the recording
  • hash (str) – The hash to update the recording with
  • name (str) – The new name of the recording
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError
validate(csdl, service='facebook')[source]

Validate the given CSDL

Parameters:
  • csdl (str) – The CSDL to be validated for analysis
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

datasift.odp module

Open Data Processing (ODP) REST API.

class datasift.odp.Odp(request)[source]

Bases: object

Represents the DataSift Open Data Processing REST API and provides the ability to query it. Internal class instantiated as part of the Client object.

batch(source_id, data)[source]

Upload data to the given source

Parameters:
  • source_id (str) – The ID of the source to upload to
  • data (list) – The data to upload to the source
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError, BadRequest

Python Client Internals

datasift.output_mapper module

class datasift.output_mapper.OutputMapper(date_strings)[source]

Bases: object

date(d)[source]
float_handler(d)[source]
outputmap(data)[source]

Internal function used to traverse a data structure and map the contents onto Python-friendly objects in place.

This uses recursion, so try not to pass in anything that’s over 255 objects deep.

Parameters:
  • data (any) – data structure
  • prefix (str) – endpoint family, e.g. sources, historics
  • endpoint (str) – endpoint being called on the API
Returns:

Nothing, edits in place
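
A recursive, in-place traversal of the kind described for `outputmap` can be sketched as follows. This is an illustration only, not the library's implementation: the function name `map_in_place`, the `date_keys` argument and the date format are all assumptions made for the example.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"   # assumed format, for illustration only


def map_in_place(data, date_keys):
    """Recursively replace date strings under the given keys with datetimes."""
    if isinstance(data, dict):
        for key, value in data.items():
            if key in date_keys and isinstance(value, str):
                # Reassigning an existing key does not change the dict's size,
                # so it is safe while iterating over items().
                data[key] = datetime.strptime(value, DATE_FORMAT)
            else:
                map_in_place(value, date_keys)
    elif isinstance(data, list):
        for item in data:
            map_in_place(item, date_keys)
    # Scalars are left untouched; nothing is returned because edits happen in place.


payload = {
    "created_at": "2015-01-01 00:00:00",
    "items": [{"created_at": "2015-01-02 12:30:00", "count": 1}],
}
result = map_in_place(payload, {"created_at"})
```

Each level of nesting adds one call to the stack, which is why deeply nested input is a concern for this style of traversal.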

datasift.request module

Thin wrapper around the requests library.

class datasift.request.DataSiftResponse(response, data)[source]

Bases: object

Base object wrapper for a response from the DataSift REST API

Variables:

raw – Raw response

Parameters:
  • response (requests.Response) – HTTP response to wrap
  • data (list) – data to wrap
headers
Returns:HTTP Headers of the Response
Return type:dict
ratelimits
Returns:Rate Limit headers
Return type:dict
status_code
Returns:HTTP Status Code of the Response
Return type:int
class datasift.request.DatasiftAuth(user, key)[source]

Bases: object

Internal class to represent an authentication pair.

Variables:
  • user – Stored username
  • key – Stored API key
class datasift.request.DictResponse(response, data)[source]

Bases: datasift.request.DataSiftResponse, dict

Wrapper for a response from the DataSift REST API, can be accessed as a dict.

clear() → None. Remove all items from D.
copy() → a shallow copy of D
fromkeys(S[, v]) → New dict with keys from S and values equal to v.

v defaults to None.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
has_key(k) → True if D has a key k, else False
headers
Returns:HTTP Headers of the Response
Return type:dict
items() → list of D's (key, value) pairs, as 2-tuples
iteritems() → an iterator over the (key, value) items of D
iterkeys() → an iterator over the keys of D
itervalues() → an iterator over the values of D
keys() → list of D's keys
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

ratelimits
Returns:Rate Limit headers
Return type:dict
setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
status_code
Returns:HTTP Status Code of the Response
Return type:int
update([E, ]**F) → None. Update D from dict/iterable E and F.

If E present and has a .keys() method, does: for k in E: D[k] = E[k]

If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v

In either case, this is followed by: for k in F: D[k] = F[k]

values() → list of D's values
viewitems() → a set-like object providing a view on D's items
viewkeys() → a set-like object providing a view on D's keys
viewvalues() → an object providing a view on D's values
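
The way DictResponse combines dict access with response metadata can be sketched roughly as follows. This is an illustrative stand-in, not the library's class: the name SketchDictResponse, its constructor arguments, and the "x-ratelimit-" header prefix are assumptions.

```python
class SketchDictResponse(dict):
    """Hypothetical stand-in: a dict that also carries HTTP metadata."""

    def __init__(self, status_code, headers, data):
        super().__init__(data)
        self.status_code = status_code
        self.headers = headers

    @property
    def ratelimits(self):
        # Assumption: rate-limit headers share an "x-ratelimit-" prefix.
        return {
            name: value
            for name, value in self.headers.items()
            if name.lower().startswith("x-ratelimit-")
        }


resp = SketchDictResponse(
    200,
    {"Content-Type": "application/json", "X-RateLimit-Remaining": "9990"},
    {"hash": "abc123"},
)
```

Here resp["hash"] reads the body like any dict, while resp.status_code and resp.ratelimits expose the metadata alongside it.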
class datasift.request.IngestRequest(auth, outputmapper, prefix=None, ssl=True, headers=None, timeout=None, proxies=None, verify=True, session=<requests.sessions.Session object>, async=False)[source]

Bases: datasift.request.PartialRequest

API_HOST = 'in.datasift.com/'
API_SCHEME = 'https'
API_VERSION = ''
CONTENT_TYPE = 'application/json'
HEADERS = (('User-Agent', 'DataSift/v1.4 Python/2.10.0'), ('Content-Type', 'application/json'))
build_response(response, path=None, parser=<function json_decode_wrapper>, async=False)

Builds a List or Dict response object.

Wrapper for a response from the DataSift REST API, can be accessed as a list.

Parameters:
  • response (DictResponse) – HTTP response to wrap
  • parser (func) – optional parser to overload how the data is loaded
Raises:

DataSiftApiException, DataSiftApiFailure, AuthException, requests.exceptions.HTTPError, RateLimitException

delete(path, params=None, headers=None, data=None)
dicts(*dicts)
get(path, params=None, headers=None)
path(*args)
post(path, data=None, headers={'Content-Type': 'application/json'})
put(path, data=None, headers={'Content-Type': 'application/json'})
with_prefix(path, *args)
class datasift.request.ListResponse(response, data)[source]

Bases: datasift.request.DataSiftResponse, list

Wrapper for a response from the DataSift REST API, can be accessed as a list.

append()

L.append(object) – append object to end

count(value) → integer -- return number of occurrences of value
extend()

L.extend(iterable) – extend list by appending elements from the iterable

headers
Returns:HTTP Headers of the Response
Return type:dict
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

insert()

L.insert(index, object) – insert object before index

pop([index]) → item -- remove and return item at index (default last).

Raises IndexError if list is empty or index is out of range.

ratelimits
Returns:Rate Limit headers
Return type:dict
remove()

L.remove(value) – remove first occurrence of value. Raises ValueError if the value is not present.

reverse()

L.reverse() – reverse IN PLACE

sort()

L.sort(cmp=None, key=None, reverse=False) – stable sort IN PLACE; cmp(x, y) -> -1, 0, 1

status_code
Returns:HTTP Status Code of the Response
Return type:int
class datasift.request.PartialRequest(auth, outputmapper, prefix=None, ssl=True, headers=None, timeout=None, proxies=None, verify=True, session=<requests.sessions.Session object>, async=False)[source]

Bases: object

Internal class used to represent a yet-to-be-completed request

API_HOST = 'api.datasift.com'
API_SCHEME = 'https'
API_VERSION = 'v1.4'
CONTENT_TYPE = 'application/json'
HEADERS = (('User-Agent', 'DataSift/v1.4 Python/2.10.0'), ('Content-Type', 'application/json'))
build_response(response, path=None, parser=<function json_decode_wrapper>, async=False)[source]

Builds a List or Dict response object.

Wrapper for a response from the DataSift REST API, can be accessed as a list.

Parameters:
  • response (DictResponse) – HTTP response to wrap
  • parser (func) – optional parser to overload how the data is loaded
Raises:

DataSiftApiException, DataSiftApiFailure, AuthException, requests.exceptions.HTTPError, RateLimitException

delete(path, params=None, headers=None, data=None)[source]
dicts(*dicts)[source]
get(path, params=None, headers=None)[source]
path(*args)[source]
post(path, data=None, headers={'Content-Type': 'application/json'})[source]
put(path, data=None, headers={'Content-Type': 'application/json'})[source]
with_prefix(path, *args)[source]
datasift.request.json_decode_wrapper(headers, data)[source]
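
As an illustration of how the API_SCHEME, API_HOST and API_VERSION constants on PartialRequest might combine into a request URL, here is a small sketch. The helper build_url and its prefix argument are hypothetical; the library's own path() and with_prefix() methods may behave differently.

```python
API_SCHEME = "https"
API_HOST = "api.datasift.com"
API_VERSION = "v1.4"


def build_url(*parts, prefix=None, host=API_HOST, version=API_VERSION):
    """Join scheme, host, version, optional prefix and path parts.

    Hypothetical helper, written only to illustrate the constants above.
    """
    segments = [host.strip("/"), version]
    if prefix:
        segments.append(prefix)
    segments.extend(str(part).strip("/") for part in parts)
    # Drop empty segments, e.g. the empty API_VERSION on IngestRequest.
    return API_SCHEME + "://" + "/".join(s for s in segments if s)


compile_url = build_url("compile")
ingest_url = build_url("source-id", host="in.datasift.com/", version="")
```

The second call mirrors the IngestRequest constants, where the host carries a trailing slash and the version is empty; "source-id" is a placeholder path segment.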

datasift.exceptions module

exception datasift.exceptions.AuthException[source]

Bases: datasift.exceptions.DataSiftException

exception datasift.exceptions.BadRequest[source]

Bases: datasift.exceptions.DataSiftException

exception datasift.exceptions.DataSiftApiException(response)[source]

Bases: datasift.exceptions.DataSiftException

Indicates that the DataSift REST API has returned an error.

The error text is available in .message, and the details are available in the response object stored in .response.

For example:

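from __future__ import print_function
from datasift.exceptions import DataSiftApiException

try:
    # client is an existing datasift.client.Client instance
    compiled = client.compile("this csdl is not going to work")
except DataSiftApiException as e:
    print("Exception:", e.message)
    print(e.response.status_code, e.response.headers)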
exception datasift.exceptions.DataSiftApiFailure[source]

Bases: datasift.exceptions.DataSiftException

Indicates that information received from DataSift could not be understood.

This usually indicates an error in the DataSift API.

exception datasift.exceptions.DataSiftException[source]

Bases: exceptions.Exception

exception datasift.exceptions.DeleteRequired[source]

Bases: datasift.exceptions.DataSiftException

exception datasift.exceptions.HistoricSourcesRequired[source]

Bases: datasift.exceptions.DataSiftException

exception datasift.exceptions.NotFoundException[source]

Bases: datasift.exceptions.DataSiftException

exception datasift.exceptions.RateLimitException(response)[source]

Bases: datasift.exceptions.DataSiftException

Indicates that the request has been refused due to rate limiting.
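
One common way to handle a rate-limit refusal is to retry with exponential backoff. The sketch below shows the idea under stated assumptions: call_with_retry and its parameters are hypothetical, and a local stand-in exception is defined so the example runs without the library installed. In real code the exception would be imported from datasift.exceptions instead.

```python
import time


class RateLimitException(Exception):
    """Local stand-in for datasift.exceptions.RateLimitException."""


def call_with_retry(func, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff when rate limited."""
    for attempt in range(retries + 1):
        try:
            return func()
        except RateLimitException:
            if attempt == retries:
                raise  # out of retries; let the caller handle it
            sleep(base_delay * 2 ** attempt)


# Demonstration with a function that is refused twice, then succeeds.
attempts = []
waits = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitException()
    return "ok"


outcome = call_with_retry(flaky, sleep=waits.append)
```

Passing sleep as an argument keeps the helper testable; the demonstration records the waits instead of actually sleeping.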

exception datasift.exceptions.StreamNotConnected[source]

Bases: datasift.exceptions.DataSiftException

exception datasift.exceptions.StreamSubscriberNotStarted[source]

Bases: datasift.exceptions.DataSiftException

exception datasift.exceptions.Unauthorized[source]

Bases: datasift.exceptions.DataSiftException

datasift.live_stream module

class datasift.live_stream.LiveStream[source]

Bases: autobahn.twisted.websocket.WebSocketClientProtocol

Internal class used to call the websocket callbacks.

onClose(wasClean, code, reason)[source]
onMessage(msg, binary)[source]
onOpen()[source]
onPing(payload)[source]
queueMessage(message)[source]
sending = False
sendqueue = []
class datasift.live_stream.LiveStreamFactory(*args, **kwargs)[source]

Bases: twisted.internet.protocol.ReconnectingClientFactory, autobahn.twisted.websocket.WebSocketClientFactory

Internal class used to implement the WebSocketClientFactory used in Client with reconnection

clientConnectionFailed(connector, reason)[source]
clientConnectionLost(connector, reason)[source]
delay = 1
maxDelay = 320
protocol

alias of LiveStream

startedConnecting(connector)[source]
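
The delay = 1 and maxDelay = 320 attributes suggest that reconnection attempts start one second apart and are capped at 320 seconds. The sketch below shows a capped backoff schedule of that shape. The doubling factor and the function name backoff_delays are assumptions for illustration; twisted's ReconnectingClientFactory applies its own growth factor and random jitter.

```python
from itertools import islice


def backoff_delays(initial=1, maximum=320, factor=2):
    """Yield reconnection delays that grow by `factor` up to `maximum`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


# First eleven delays of the assumed schedule, in seconds.
delays = list(islice(backoff_delays(), 11))
```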

datasift.pylon module

class datasift.pylon.Pylon(request)[source]

Bases: object

Represents the DataSift Pylon API and provides the ability to query it. Internal class instantiated as part of the Client object.

analyze(id, parameters, filter=None, start=None, end=None, service='facebook')[source]

Analyze the recorded data for a given id

Parameters:
  • id (str) – The id of the recording
  • parameters (dict) – To set settings such as threshold and target
  • filter (str) – An optional secondary filter
  • start (int) – Start of the time period to analyze
  • end (int) – End of the time period to analyze
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

compile(csdl, service='facebook')[source]

Compile the given CSDL

Parameters:
  • csdl (str) – The CSDL to be compiled for analysis
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

get(id, service='facebook')[source]

Get the existing recording for a given id

Parameters:
  • service (str) – The service for this API call (facebook, etc)
  • id (str) – The id of the recording to get
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

list(page=None, per_page=None, order_by='created_at', order_dir='DESC', service='facebook')[source]

List pylon recordings

Parameters:
  • page (int) – page number for pagination
  • per_page (int) – number of items per page, default 20
  • order_by (str) – field to order by, default created_at
  • order_dir (str) – direction to order by, asc or desc, default desc
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

sample(id, count=None, start=None, end=None, filter=None, service='facebook')[source]

Get sample interactions for a given recording

Parameters:
  • id (str) – The id of the recording to sample
  • count (int) – (optional) number of sample interactions to return
  • start (int) – Start of the time period to sample
  • end (int) – End of the time period to sample
  • filter (str) – An optional secondary filter
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

start(hash=None, name=None, id=None, service='facebook')[source]

Start a recording for the provided hash

Parameters:
  • hash (str) – The hash to start recording with
  • name (str) – The name of the recording
  • id (str) – The id of the recording
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

stop(id, service='facebook')[source]

Stop the recording for the provided id

Parameters:
  • id (str) – The id of the recording to stop
  • service (str) – The service for this API call (facebook, etc)
Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

tags(id, service='facebook')[source]

Get the existing tag analysis for a given recording

Parameters:
  • id (str) – The id of the recording to get tag analysis for
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

update(id, hash=None, name=None, service='facebook')[source]

Update an existing recording with a new filter hash

Parameters:
  • id (str) – The id of the recording
  • hash (str) – The hash to update the recording with
  • name (str) – The new name of the recording
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

validate(csdl, service='facebook')[source]

Validate the given CSDL

Parameters:
  • csdl (str) – The CSDL to be validated for analysis
  • service (str) – The service for this API call (facebook, etc)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError
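
The Pylon methods above can be chained into a simple workflow: validate the CSDL, compile it, then start a recording with the resulting hash. The sketch below assumes the compile response exposes the hash under a "hash" key and the start response exposes the recording id under an "id" key; neither key is confirmed by this reference. FakePylon is a hypothetical in-memory stand-in so the example runs without network access, and the CSDL string is only an illustrative value.

```python
def start_recording(pylon, csdl, name):
    """Validate and compile CSDL, then start a recording with its hash."""
    pylon.validate(csdl)  # expected to raise if the CSDL is rejected
    compiled = pylon.compile(csdl)
    # Assumption: the compile response holds the filter hash under "hash".
    recording = pylon.start(hash=compiled["hash"], name=name)
    # Assumption: the start response holds the recording id under "id".
    return recording["id"]


class FakePylon:
    """Hypothetical stand-in exposing the same method names as Pylon."""

    def __init__(self):
        self.calls = []

    def validate(self, csdl, service="facebook"):
        self.calls.append("validate")
        return {}

    def compile(self, csdl, service="facebook"):
        self.calls.append("compile")
        return {"hash": "h1"}

    def start(self, hash=None, name=None, id=None, service="facebook"):
        self.calls.append(("start", hash, name))
        return {"id": "r1"}


fake = FakePylon()
recording_id = start_recording(fake, 'fb.content contains "coffee"', "demo")
```

With a real client, the pylon argument would be the Pylon object attached to the Client, and the returned id could then be passed to analyze, sample, tags or stop.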

datasift.pylon_task module

class datasift.pylon_task.PylonTask(request)[source]

Bases: object

Represents the DataSift Pylon task API and provides the ability to query it. Internal class instantiated as part of the Client object.

create(subscription_id, name, parameters, type='analysis', service='facebook')[source]

Create a PYLON task

Parameters:
  • subscription_id (str) – The ID of the recording to create the task for
  • name (str) – The name of the new task
  • parameters (dict) – The parameters for this task
  • type (str) – The type of analysis to create, currently only ‘analysis’ is accepted
  • service (str) – The PYLON service (facebook)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

get(id, service='facebook')[source]

Get a given Pylon task

Parameters:
  • id (str) – The ID of the task
  • service (str) – The PYLON service (facebook)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

list(per_page=None, page=None, status=None, service='facebook')[source]

Get a list of Pylon tasks

Parameters:
  • per_page (int) – How many tasks to display per page
  • page (string) – Which page of tasks to display
  • status – The status of the tasks to list
  • service (str) – The PYLON service (facebook)
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

datasift.identity module

class datasift.identity.Identity(request)[source]

Bases: object

Represents the identity API and provides the ability to query it. Internal class instantiated as part of the Client object.

create(label, status=None, master=None)[source]

Create an Identity

Parameters:
  • label – The label to give this new identity
  • status – The status of this identity. Default: ‘active’
  • master – Represents whether this identity is a master. Default: False
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

delete(id)[source]

Delete an Identity

Parameters:id – The ID of the identity to delete
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
get(id)[source]

Get an Identity

Parameters:id – The ID of the identity to retrieve
Returns:dict of REST API output with headers attached
Return type:DictResponse
Raises:DataSiftApiException, requests.exceptions.HTTPError
list(label=None, per_page=20, page=1)[source]

Get a list of identities that have been created

Parameters:
  • per_page (int) – The number of results per page returned
  • page (int) – The page number of the results
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError
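
Since list takes per_page and page arguments, walking through every result means requesting successive pages until one comes back short. The helper below is a sketch of that loop: iter_all is hypothetical, and because the response layout is not specified in this reference, the caller must name the field that holds the items. The "identities" key used in the demonstration is an assumption.

```python
def iter_all(list_page, items_key, per_page=20):
    """Yield every item from a paginated list call.

    `list_page` is any callable accepting per_page= and page= keywords,
    such as a bound list() method. `items_key` names the response field
    holding the items (an assumption supplied by the caller).
    """
    page = 1
    while True:
        response = list_page(per_page=per_page, page=page)
        items = response.get(items_key, [])
        for item in items:
            yield item
        if len(items) < per_page:
            break  # a short or empty page means the end was reached
        page += 1


def fake_list(label=None, per_page=20, page=1):
    """Stand-in for a list() call; serves five records in pages."""
    data = [{"id": n} for n in range(5)]
    start = (page - 1) * per_page
    return {"identities": data[start:start + per_page]}


ids = [item["id"] for item in iter_all(fake_list, "identities", per_page=2)]
```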

update(id, label=None, status=None, master=None)[source]

Update an Identity

Parameters:
  • id – The ID of the identity to update
  • label – The new label for this identity
  • status – The status of this identity. Default: ‘active’
  • master – Represents whether this identity is a master. Default: False
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

datasift.token module

class datasift.token.Token(request)[source]

Bases: object

Represents the DataSift token API and provides the ability to query it. Internal class instantiated as part of the Client object.

create(identity_id, service, token)[source]

Create the token

Parameters:
  • identity_id – The ID of the identity to retrieve
  • service – The service that the token is linked to
  • token – The token provided by the service
  • expires_at – Set an expiry for this token
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

delete(identity_id, service)[source]

Delete the token

Parameters:
  • identity_id – The ID of the identity to retrieve
  • service – The service that the token is linked to
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

get(identity_id, service)[source]

Get a token for a specific identity and service

Parameters:
  • identity_id – The ID of the identity to retrieve
  • service – The service that the token is linked to
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

list(identity_id, per_page=20, page=1)[source]

Get a list of tokens

Parameters:
  • identity_id – The ID of the identity to retrieve tokens for
  • per_page – The number of results per page returned
  • page – The page number of the results
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError

update(identity_id, service, token=None)[source]

Update the token

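Parameters:
  • identity_id – The ID of the identity that owns the token
  • service – The service that the token is linked to
  • token – The token provided by the service
Returns:

dict of REST API output with headers attached

Return type:

DictResponse

Raises:

DataSiftApiException, requests.exceptions.HTTPError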