Welcome to DataSift Python Client Library’s documentation!

The official DataSift API library for Python. This module provides access to the REST API and also facilitates consuming streams.

Requires Python 2.4+.

To use, ‘import datasift’ and create a datasift.User object passing in your username and API key. See the examples folder for reference usage.
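The usage described above can be sketched as follows. The helper name `open_definition` and its parameters are illustrative and not part of the library. The user class is passed in as an argument so that the helper has no hard dependency on the installed package; the credentials and CSDL in the usage comment are placeholders.

```python
def open_definition(user_class, username, api_key, csdl):
    """Create a user object and a stream definition for the given CSDL.

    user_class is expected to be datasift.User (or a compatible stand-in).
    """
    user = user_class(username, api_key)
    definition = user.create_definition(csdl)
    return user, definition


# Typical call (placeholder credentials and CSDL):
#   import datasift
#   user, definition = open_definition(
#       datasift.User, "your_username", "your_api_key",
#       'interaction.content contains "python"')
```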

Source Code:

https://github.com/datasift/datasift-python

Examples:

https://github.com/datasift/datasift-python/tree/master/examples

DataSift Platform Documentation:

http://dev.datasift.com/docs/

Copyright (C) 2012 MediaSift Ltd. All Rights Reserved.

This software is Open Source. Read the license:

https://github.com/datasift/datasift-python/blob/master/LICENSE

exception datasift.APIError[source]

Thrown for errors that occur while talking to the DataSift API.

exception datasift.AccessDeniedError[source]

This exception is thrown when an access denied error is returned by the DataSift API.

class datasift.ApiClient[source]

The default class used for accessing the DataSift API.

call(username, api_key, endpoint, params={}, user_agent='DataSiftPython/0.5.6')[source]

Make a call to a DataSift API endpoint.

exception datasift.CompileFailedError[source]

Thrown when compilation of a definition fails.

class datasift.Definition(user, csdl='', hash=False)[source]

A Definition instance represents a stream definition.

clear_hash()[source]

Reset the hash to false. The effect of this is to mark the definition as requiring compilation. Also resets other variables that depend on the CSDL.

compile()[source]

Call the DataSift API to compile this definition. If compilation succeeds, the details returned in the response are stored on this object.

create_historic(start, end, sources, sample, name)[source]

Create a historic query based on this definition.

get()[source]

Get the definition’s CSDL string.

get_buffered(count=False, from_id=False)[source]

Call the DataSift API to get buffered interactions.

get_consumer(event_handler, consumer_type='http')[source]

Returns a StreamConsumer-derived object for this definition for the given type.

get_created_at()[source]

Returns the date when the stream was first created. If the created at date has not yet been obtained it validates the definition first.

get_dpu_breakdown()[source]

Call the DataSift API to get the DPU breakdown for this definition.

get_hash()[source]

Returns the hash for this definition. If the hash has not yet been obtained it compiles the definition first.

get_total_dpu()[source]

Returns the total DPU of the stream. If the DPU has not yet been obtained it validates the definition first.

set(csdl)[source]

Set the definition string.

validate()[source]

Call the DataSift API to validate this definition. If validation succeeds, the details returned in the response are stored on this object.
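Several Definition getters above compile or validate on demand. The sketch below gathers them into one dictionary using only the methods listed for this class; the helper name `summarize_definition` and the dictionary keys are hypothetical.

```python
def summarize_definition(definition):
    """Collect the lazily obtained details of a Definition into a dict."""
    return {
        "csdl": definition.get(),
        # Compiles the definition first if no hash has been obtained yet.
        "hash": definition.get_hash(),
        # Validates the definition first if the DPU is not yet known.
        "total_dpu": definition.get_total_dpu(),
        # Validates the definition first if the date is not yet known.
        "created_at": definition.get_created_at(),
    }
```

Each getter may trigger an API call, so any error raised by the API would surface from this helper as well.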

class datasift.Historic(user, hash, start=None, end=None, sources=None, sample=None, name=None)[source]

A Historic instance represents a historic query.

delete()[source]

Delete this historic query.

get_availability()[source]

Get the data availability info. If the query has not yet been prepared, it is prepared automatically to obtain the availability data.

get_created_at()[source]

Returns the created_at date for this query.

get_dpus()[source]

Get the DPU cost. If the query has not yet been prepared, it is prepared automatically to obtain the cost.

get_end_date()[source]

Returns the end date for this query.

get_hash()[source]

Get the playback ID for this query. If the query has not yet been prepared, it is prepared automatically to obtain the hash.

get_name()[source]

Returns the friendly name of this query.

get_progress()[source]

Returns the progress percentage of this query.

get_sample()[source]

Returns the sample percentage of this query.

get_sources()[source]

Returns the sources for this query.

get_start_date()[source]

Returns the start date for this query.

get_status()[source]

Returns the status of this query.

get_stream_hash()[source]

Get the hash for the stream this Historics query is using.

static list(user, page=1, per_page=20)[source]

Get a page of Historics queries in the given user’s account, where each page contains up to per_page items.
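Paged listings such as this one can be walked with a small generator. This is a sketch under stated assumptions: the shape of the value returned for a page is not described here, so the caller supplies a hypothetical `extract_items` function that pulls the list of items out of one page. The names `iter_pages`, `fetch_page`, `extract_items` and `max_pages` are all illustrative.

```python
def iter_pages(fetch_page, extract_items, per_page=20, max_pages=100):
    """Yield items page by page until a short or empty page is seen.

    fetch_page(page, per_page) returns one page of results.
    extract_items(result) returns the list of items in that page.
    """
    page = 1
    while page <= max_pages:
        items = list(extract_items(fetch_page(page, per_page)))
        for item in items:
            yield item
        # A page with fewer than per_page items is taken to be the last.
        if len(items) < per_page:
            break
        page += 1


# Possible use (the extractor depends on the actual return shape):
#   for query in iter_pages(
#           lambda page, n: datasift.Historic.list(user, page, n),
#           extract_items=my_extractor):
#       ...
```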

prepare()[source]

Call the DataSift API to prepare this historic query.

set_name(name)[source]

Set the friendly name for this query.

start()[source]

Start this historic query.

stop()[source]

Stop this historic query.
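Because get_dpus() prepares the query when needed, the cost can be checked before the query is started. The following sketch assumes that get_dpus() returns a number that can be compared with a budget; the helper name `start_if_affordable` and the `max_dpus` parameter are hypothetical.

```python
def start_if_affordable(historic, max_dpus):
    """Start a historic query only when its DPU cost is within budget.

    Returns True if the query was started, False otherwise.
    """
    # get_dpus() prepares the query first if that has not been done yet.
    cost = historic.get_dpus()
    if cost > max_dpus:
        return False
    historic.start()
    return True
```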

exception datasift.InvalidDataError[source]

Thrown whenever invalid data is detected.

class datasift.PushDefinition(user)[source]

A PushDefinition instance represents a push endpoint configuration.

get_initial_status()[source]

Get the initial status for subscriptions.

get_output_param(key)[source]

Get an output parameter.

get_output_params()[source]

Get all of the output parameters.

get_output_type()[source]

Get the output type.

set_initial_status(status)[source]

Set the initial status for subscriptions.

set_output_param(key, val)[source]

Set an output parameter.

set_output_type(output_type)[source]

Set the output type.

subscribe(hash_type, hash, name)[source]

Subscribe this endpoint to a stream hash or historic playback ID. Note that this will activate the subscription if the initial status is set to active.

subscribe_definition(definition, name)[source]

Subscribe this endpoint to a Definition.

subscribe_historic(historic, name)[source]

Subscribe this endpoint to a Historic.

subscribe_historic_playback_id(playback_id, name)[source]

Subscribe this endpoint to a historic playback ID.

subscribe_stream_hash(hash, name)[source]

Subscribe this endpoint to a stream hash.

validate()[source]

Validate the output type and parameters with the DataSift API.
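The PushDefinition methods above suggest a configure, validate, subscribe sequence. The sketch below strings them together, assuming that validate() signals a rejected configuration by raising an exception (not stated here). The helper name `subscribe_with_output` is illustrative, and the output type and parameter keys in the usage comment are placeholders rather than real values.

```python
def subscribe_with_output(user, definition, name, output_type, output_params):
    """Configure a push endpoint, validate it, and subscribe a Definition."""
    push_def = user.create_push_definition()
    push_def.set_output_type(output_type)
    # Sorted only so that parameters are applied in a predictable order.
    for key, val in sorted(output_params.items()):
        push_def.set_output_param(key, val)
    # Check the output type and parameters with the API before subscribing.
    push_def.validate()
    return push_def.subscribe_definition(definition, name)


# Possible use (placeholder output type and parameter key):
#   subscribe_with_output(user, definition, "my subscription",
#                         "example_type", {"example.key": "value"})
```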

class datasift.PushSubscription(user, data)[source]

A PushSubscription instance represents the subscription of a push endpoint to either a stream hash or a historic playback ID.

delete()[source]

Delete this subscription.

static get(user, id)[source]

Get a push subscription by ID.

get_created_at()[source]

Get the timestamp when this subscription was created.

get_hash()[source]

Get the hash or playback ID to which this subscription is subscribed.

get_hash_type()[source]

Get the hash type to which this subscription is subscribed.

get_id()[source]

Return the subscription ID.

get_last_request()[source]

Get the timestamp of the last push request.

get_last_success()[source]

Get the timestamp of the last successful push request.

get_log(page=1, per_page=20, order_by=False, order_dir=False)[source]

Get a page of the log for this subscription in the order specified.

static get_logs(user, page=1, per_page=20, order_by=False, order_dir=False, id=False)[source]

Page through recent push subscription log entries, specifying the sort order.

get_name()[source]

Return the subscription name.

get_status()[source]

Get the current status of this subscription. Call reload() first to make sure the status reflects the latest data for this subscription.
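Since the status is only as fresh as the last reload, polling needs to reload before each check. A minimal sketch follows, with hypothetical names (`wait_for_status`, `wanted`, `attempts`, `delay`); the set of possible status values is not listed here, so the caller supplies the ones to wait for.

```python
import time


def wait_for_status(subscription, wanted, attempts=10, delay=5.0,
                    sleep=time.sleep):
    """Poll a subscription until its status is one of `wanted`.

    Returns the matching status, or None if it never appeared.
    """
    for attempt in range(attempts):
        # Re-fetch from the API so get_status() is not stale.
        subscription.reload()
        status = subscription.get_status()
        if status in wanted:
            return status
        if attempt < attempts - 1:
            sleep(delay)
    return None
```

The `sleep` parameter exists only so that the wait can be replaced in tests.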

is_deleted()[source]

Returns True if this subscription has been deleted.

static list(user, page=1, per_page=20, order_by=False, order_dir=False, include_finished=False, hash_type=False, hash=False)[source]

Get a page of push subscriptions in the given user’s account, where each page contains up to per_page items. Results will be ordered according to the supplied ordering parameters.

static list_by_playback_id(user, playback_id, page=1, per_page=20, order_by=False, order_dir=False)[source]

Get a page of push subscriptions in the given user’s account subscribed to the given playback ID, where each page contains up to per_page items. Results will be ordered according to the supplied ordering parameters.

static list_by_stream_hash(user, hash, page=1, per_page=20, order_by=False, order_dir=False)[source]

Get a page of push subscriptions in the given user’s account subscribed to the given stream hash, where each page contains up to per_page items. Results will be ordered according to the supplied ordering parameters.

pause()[source]

Pause this subscription.

reload()[source]

Re-fetch this subscription from the API.

resume()[source]

Resume this subscription.

save()[source]

Save changes to the name and output parameters of this subscription.

set_output_param(key, val)[source]

Set an output parameter. Checks to see if the subscription has been deleted, and if not calls the base class to set the parameter.

stop()[source]

Stop this subscription.

exception datasift.RateLimitExceededError[source]

Thrown when you exceed the API rate limit.
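One way a caller might react to this exception is to retry with an increasing delay. The sketch below is a generic helper, not something the library provides; `call_with_backoff` and its parameters are hypothetical, and whether retrying is appropriate depends on how the rate limit resets, which is not described here.

```python
import time


def call_with_backoff(func, retry_on, attempts=3, first_delay=1.0,
                      sleep=time.sleep):
    """Call func(), retrying with a doubling delay on `retry_on` errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = first_delay
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            # Out of attempts: let the last error propagate.
            if attempt == attempts - 1:
                raise
            sleep(delay)
            delay *= 2


# Possible use:
#   call_with_backoff(definition.compile,
#                     retry_on=datasift.RateLimitExceededError)
```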

class datasift.StreamConsumer(user, definition, event_handler)[source]

This is the base class for all protocol-specific StreamConsumer classes.

STATE_STOPPING = 3

One of the possible consumer states.

TYPE_HTTP = 'http'

Class variable identifying the HTTP consumer type.

consume(auto_reconnect=True)[source]

Start consuming.

static factory(user, consumer_type, definition, event_handler)[source]

Factory method for creating protocol-specific StreamConsumer objects.

stop()[source]

Stop the consumer.
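The consume() and stop() calls pair naturally, so a context manager can make sure stop() always runs. This sketch assumes that consume() returns once consumption has started; if it blocks instead, the body of the `with` statement would only run after it returns. The name `consuming` is illustrative, and contextlib requires Python 2.5 or later.

```python
from contextlib import contextmanager


@contextmanager
def consuming(definition, event_handler, consumer_type="http",
              auto_reconnect=True):
    """Start a consumer for a Definition and stop it on exit."""
    consumer = definition.get_consumer(event_handler, consumer_type)
    consumer.consume(auto_reconnect)
    try:
        yield consumer
    finally:
        # Runs even if the body of the with-block raises.
        consumer.stop()


# Possible use (handler is an instance of a StreamConsumerEventHandler
# subclass):
#   with consuming(definition, handler) as consumer:
#       ...
```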

class datasift.StreamConsumerEventHandler[source]

A base class for implementing event handlers for StreamConsumers.

exception datasift.StreamError[source]

Thrown for errors to do with the streaming API.

class datasift.User(username, api_key, use_ssl=True, stream_base_url='stream.datasift.com/')[source]

A User instance represents a DataSift user and provides access to all of the API functionality.

call_api(endpoint, params)[source]

Make a call to a DataSift API endpoint.

create_definition(csdl='')[source]

Create a definition object for this user. If a CSDL parameter is provided then this will be used as the initial CSDL for the definition.

create_historic(hash, start, end, sources, sample, name)[source]

Create a historic query based on the given stream hash.

create_push_definition()[source]

Create a new Push definition for this user.

enable_ssl(use_ssl)[source]

Set whether stream connections should use SSL.

get_api_key()[source]

Get the API key.

get_consumer(hash, event_handler, consumer_type='http')[source]

Get a StreamConsumer object for the given hash via the given consumer type.

get_historic(playback_id)[source]

Get an existing Historics query from the API.

get_multi_consumer(hashes, event_handler, consumer_type='http')[source]

Get a StreamConsumer object for the given set of hashes via the given consumer type.

get_push_subscription(subscription_id)[source]

Get a Push subscription from the API.

get_push_subscription_log(subscription_id=False)[source]

Get the logs for all Push subscriptions or the given subscription.

get_rate_limit()[source]

Get the rate limit returned by the last API call, or -1 if no API calls have been made since this object was created.

get_rate_limit_remaining()[source]

Get the rate limit remaining as returned by the last API call, or -1 if no API calls have been made since this object was created.
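The -1 sentinel returned by both rate limit getters needs to be handled before the values are displayed. A minimal sketch follows; the helper name `rate_limit_status` and the message wording are hypothetical.

```python
def rate_limit_status(user):
    """Describe the rate limit reported by the last API call."""
    limit = user.get_rate_limit()
    remaining = user.get_rate_limit_remaining()
    # Both getters return -1 until an API call has been made.
    if limit == -1 or remaining == -1:
        return "rate limit unknown (no API calls made yet)"
    return "%s of %s remaining" % (remaining, limit)
```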

get_usage(period='hour')[source]

Get usage data for this user.

get_useragent()[source]

Get the useragent to be used for all API requests.

get_username()[source]

Get the username.

list_historics(page=1, per_page=20)[source]

Get the Historics queries in your account.

list_push_subscriptions(page=1, per_page=20, order_by=False, order_dir=False, include_finished=False, hash_type=False, hash=False)[source]

Get the Push subscriptions in your account.

set_api_client(api_client)[source]

Set the object to be used as the API client. This must be a subclass of the default API client class.
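Since the client must derive from the default API client class, one option is to build the subclass from whatever base class is supplied. The sketch below records each endpoint before delegating to the base implementation of call(). The names `make_recording_client`, `RecordingClient` and `log` are hypothetical, and the usage comment assumes that datasift.ApiClient can be constructed without arguments.

```python
def make_recording_client(base_class, log):
    """Return a subclass of base_class that records each endpoint called."""

    class RecordingClient(base_class):
        def call(self, username, api_key, endpoint, *args, **kwargs):
            log.append(endpoint)
            # Delegate to the default implementation unchanged.
            return base_class.call(self, username, api_key, endpoint,
                                   *args, **kwargs)

    return RecordingClient


# Possible use:
#   calls = []
#   client_class = make_recording_client(datasift.ApiClient, calls)
#   user.set_api_client(client_class())
```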

use_ssl()[source]

Returns true if stream connections should be using SSL.
