Storage#
Cloud Storage in 10 seconds#
Install the library#
The source code for the library (and demo code) lives on GitHub. You can install the library quickly with pip:
$ pip install gcloud
Run the demo#
In order to run the demo, you need to have registered an actual gcloud project, so you'll need to provide some environment variables to facilitate authentication to your project:
- GCLOUD_TESTS_PROJECT_ID: Developers Console project ID (e.g. bamboo-shift-455).
- GCLOUD_TESTS_DATASET_ID: The name of the dataset your tests connect to. This is typically the same as GCLOUD_TESTS_PROJECT_ID.
- GCLOUD_TESTS_CLIENT_EMAIL: The email for the service account you're authenticating with.
- GCLOUD_TESTS_KEY_FILE: The path to an encrypted key file. See private key docs for explanation on how to get a private key.
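Before running the demo, you may want to sanity-check that the variables above are actually set. A minimal, illustrative helper (not part of the library; the variable names are taken from the list above):

```python
import os

# The four variables the demo expects (names taken from the list above).
REQUIRED_VARS = (
    'GCLOUD_TESTS_PROJECT_ID',
    'GCLOUD_TESTS_DATASET_ID',
    'GCLOUD_TESTS_CLIENT_EMAIL',
    'GCLOUD_TESTS_KEY_FILE',
)


def missing_vars(environ=os.environ):
    """Return the names of any required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == '__main__':
    missing = missing_vars()
    if missing:
        print('Missing environment variables: %s' % ', '.join(missing))
    else:
        print('All demo environment variables are set.')
```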
Run the example script included in the package:
$ python -m gcloud.storage.demo
And that’s it! You should be walking through a demonstration of using gcloud.storage to read and write data to Google Cloud Storage.
Try it yourself#
You can interact with the demo in a Python interactive shell.
Start by importing the demo module and instantiating the demo connection:
>>> from gcloud.storage import demo
>>> connection = demo.get_connection()
Once you have the connection, you can create buckets and keys:
>>> connection.get_all_buckets()
[<Bucket: ...>, ...]
>>> bucket = connection.create_bucket('my-new-bucket')
>>> print bucket
<Bucket: my-new-bucket>
>>> key = bucket.new_key('my-test-file.txt')
>>> print key
<Key: my-new-bucket, my-test-file.txt>
>>> key = key.set_contents_from_string('this is test content!')
>>> print key.get_contents_as_string()
'this is test content!'
>>> print bucket.get_all_keys()
[<Key: my-new-bucket, my-test-file.txt>]
>>> key.delete()
>>> bucket.delete()
Note
The get_connection method is just a shortcut for:
>>> from gcloud import storage
>>> from gcloud.storage import demo
>>> connection = storage.get_connection(
...     demo.PROJECT_NAME, demo.CLIENT_EMAIL, demo.PRIVATE_KEY_PATH)
gcloud.storage#
Shortcut methods for getting set up with Google Cloud Storage.
You’ll typically use these to get started with the API:
>>> import gcloud.storage
>>> bucket = gcloud.storage.get_bucket('bucket-id-here',
...                                    'project-id',
...                                    'long-email@googleapis.com',
...                                    '/path/to/private.key')
>>> # Then do other things...
>>> key = bucket.get_key('/remote/path/to/file.txt')
>>> print key.get_contents_as_string()
>>> key.set_contents_from_string('New contents!')
>>> bucket.upload_file('/remote/path/storage.txt', '/local/path.txt')
The main concepts with this API are:
- gcloud.storage.connection.Connection which represents a connection between your machine and the Cloud Storage API.
- gcloud.storage.bucket.Bucket which represents a particular bucket (akin to a mounted disk on a computer).
- gcloud.storage.key.Key which represents a pointer to a particular entity in Cloud Storage (akin to a file path on a remote machine).
- gcloud.storage.__init__.get_bucket(bucket_name, project, client_email, private_key_path)[source]#
Shortcut method to establish a connection to a particular bucket.
You’ll generally use this as the first call to working with the API:
>>> from gcloud import storage
>>> bucket = storage.get_bucket(bucket_name, project, email, key_path)
>>> # Now you can do things with the bucket.
>>> bucket.exists('/path/to/file.txt')
False
Parameters: - bucket_name (string) – The id of the bucket you want to use. This is akin to a disk name on a file system.
- project (string) – The name of the project to connect to.
- client_email (string) – The e-mail attached to the service account.
- private_key_path (string) – The path to a private key file (this file was given to you when you created the service account).
Return type: gcloud.storage.bucket.Bucket
Returns: A bucket with a connection using the provided credentials.
- gcloud.storage.__init__.get_connection(project, client_email, private_key_path)[source]#
Shortcut method to establish a connection to Cloud Storage.
Use this if you are going to access several buckets with the same set of credentials:
>>> from gcloud import storage
>>> connection = storage.get_connection(project, email, key_path)
>>> bucket1 = connection.get_bucket('bucket1')
>>> bucket2 = connection.get_bucket('bucket2')
Parameters: - project (string) – The name of the project to connect to.
- client_email (string) – The e-mail attached to the service account.
- private_key_path (string) – The path to a private key file (this file was given to you when you created the service account).
Return type: gcloud.storage.connection.Connection
Returns: A connection defined with the proper credentials.
Connections#
Create / interact with gcloud storage connections.
- class gcloud.storage.connection.Connection(project, *args, **kwargs)[source]#
Bases: gcloud.connection.Connection
A connection to Google Cloud Storage via the JSON REST API.
This class should understand only the basic types (and protobufs) in method arguments, however should be capable of returning advanced types.
See gcloud.connection.Connection for a full list of parameters. Connection differs only in needing a project name (which you specify when creating a project in the Cloud Console).
A typical use of this is to operate on gcloud.storage.bucket.Bucket objects:
>>> from gcloud import storage
>>> connection = storage.get_connection(project, email, key_path)
>>> bucket = connection.create_bucket('my-bucket-name')
You can then delete this bucket:
>>> bucket.delete()
>>> # or
>>> connection.delete_bucket(bucket)
If you want to access an existing bucket:
>>> bucket = connection.get_bucket('my-bucket-name')
A Connection is actually iterable and will return the gcloud.storage.bucket.Bucket objects inside the project:
>>> for bucket in connection:
...     print bucket
<Bucket: my-bucket-name>
In that same way, you can check for whether a bucket exists inside the project using Python’s in operator:
>>> print 'my-bucket-name' in connection
True
Parameters: project (string) – The project name to connect to.
- API_ACCESS_ENDPOINT = 'https://storage.googleapis.com'#
- API_URL_TEMPLATE = '{api_base_url}/storage/{api_version}{path}'#
A template for the URL of a particular API call.
- API_VERSION = 'v1'#
The version of the API, used in building the API call’s URL.
- api_request(method, path, query_params=None, data=None, content_type=None, api_base_url=None, api_version=None, expect_json=True)[source]#
Make a request over the HTTP transport to the Cloud Storage API.
You shouldn't need to use this method directly, but if you plan to interact with the API using these primitives, this is the correct one to use.
Parameters: - method (string) – The HTTP method name (ie, GET, POST, etc). Required.
- path (string) – The path to the resource (ie, '/b/bucket-name'). Required.
- query_params (dict) – A dictionary of keys and values to insert into the query string of the URL. Default is empty dict.
- data (string) – The data to send as the body of the request. Default is the empty string.
- content_type (string) – The proper MIME type of the data provided. Default is None.
- api_base_url (string) – The base URL for the API endpoint. Typically you won’t have to provide this. Default is the standard API base URL.
- api_version (string) – The version of the API to call. Typically you shouldn’t provide this and instead use the default for the library. Default is the latest API version supported by gcloud-python.
- expect_json (bool) – If True, this method will try to parse the response as JSON and raise an exception if that cannot be done. Default is True.
Raises: Exception if the response code is not 200 OK.
- build_api_url(path, query_params=None, api_base_url=None, api_version=None)[source]#
Construct an API url given a few components, some optional.
Typically, you shouldn’t need to use this method.
Parameters: - path (string) – The path to the resource (ie, '/b/bucket-name').
- query_params (dict) – A dictionary of keys and values to insert into the query string of the URL.
- api_base_url (string) – The base URL for the API endpoint. Typically you won’t have to provide this.
- api_version (string) – The version of the API to call. Typically you shouldn’t provide this and instead use the default for the library.
Return type: string
Returns: The URL assembled from the pieces provided.
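To make the assembly concrete, here is a self-contained sketch of how a URL might be built from the class constants documented above (API_URL_TEMPLATE and API_VERSION match the values shown; the base URL here is an assumption for illustration, not necessarily the library's actual default):

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

# Assumed default base URL for illustration.
API_BASE_URL = 'https://www.googleapis.com'
API_VERSION = 'v1'
API_URL_TEMPLATE = '{api_base_url}/storage/{api_version}{path}'


def build_api_url(path, query_params=None, api_base_url=None, api_version=None):
    """Fill in the URL template, then append any query string."""
    url = API_URL_TEMPLATE.format(
        api_base_url=(api_base_url or API_BASE_URL),
        api_version=(api_version or API_VERSION),
        path=path,
    )
    if query_params:
        url += '?' + urlencode(query_params)
    return url
```

For example, `build_api_url('/b/my-bucket')` yields `'https://www.googleapis.com/storage/v1/b/my-bucket'`.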
- create_bucket(bucket)[source]#
Create a new bucket.
For example:
>>> from gcloud import storage
>>> connection = storage.get_connection(project, client, key_path)
>>> bucket = connection.create_bucket('my-bucket')
>>> print bucket
<Bucket: my-bucket>
Parameters: bucket (string or gcloud.storage.bucket.Bucket) – The bucket name (or bucket object) to create.
Return type: gcloud.storage.bucket.Bucket
Returns: The newly created bucket.
Raises: gcloud.storage.exceptions.Conflict if there is a conflict (bucket already exists, invalid name, etc.)
- delete_bucket(bucket, force=False)[source]#
Delete a bucket.
You can use this method to delete a bucket by name, or to delete a bucket object:
>>> from gcloud import storage
>>> connection = storage.get_connection(project, email, key_path)
>>> connection.delete_bucket('my-bucket')
True
You can also pass in the bucket object:
>>> bucket = connection.get_bucket('other-bucket')
>>> connection.delete_bucket(bucket)
True
If the bucket doesn’t exist, this will raise a gcloud.storage.exceptions.NotFound:
>>> from gcloud.storage import exceptions
>>> try:
...     connection.delete_bucket('my-bucket')
... except exceptions.NotFound:
...     print 'That bucket does not exist!'
Parameters: - bucket (string or gcloud.storage.bucket.Bucket) – The bucket name (or bucket object) to delete.
- force (bool) – If True, empties the bucket's objects then deletes it.
Return type: bool
Returns: True if the bucket was deleted.
Raises: gcloud.storage.exceptions.NotFound if the bucket doesn’t exist, or gcloud.storage.exceptions.Conflict if the bucket has keys and force is not passed.
- generate_signed_url(resource, expiration, method='GET', content_md5=None, content_type=None)[source]#
Generate a signed URL to provide query-string authentication to a resource.
Parameters: - resource (string) – A pointer to a specific resource (typically, /bucket-name/path/to/key.txt).
- expiration (int, long, datetime.datetime, datetime.timedelta) – When the signed URL should expire.
- method (string) – The HTTP verb that will be used when requesting the URL.
- content_md5 (string) – The MD5 hash of the object referenced by resource.
- content_type (string) – The content type of the object referenced by resource.
Return type: string
Returns: A signed URL you can use to access the resource until expiration.
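Since expiration accepts several types (int, long, datetime.datetime, datetime.timedelta), one reasonable way to normalize them all to a POSIX timestamp is sketched below. This helper is illustrative only, not the library's actual implementation; naive datetimes are treated as UTC, which is an assumption:

```python
import calendar
import datetime


def normalize_expiration(expiration, _now=None):
    """Convert the accepted expiration types into a POSIX timestamp (int)."""
    now = _now or datetime.datetime.utcnow()
    if isinstance(expiration, datetime.timedelta):
        # A timedelta is interpreted relative to "now".
        expiration = now + expiration
    if isinstance(expiration, datetime.datetime):
        # Naive datetimes are assumed to be UTC here.
        expiration = calendar.timegm(expiration.timetuple())
    if not isinstance(expiration, int):
        raise TypeError('Unexpected expiration type: %r' % (expiration,))
    return expiration
```

For example, passing `datetime.timedelta(hours=1)` produces a timestamp one hour from now, while an int is passed through unchanged.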
- get_all_buckets()[source]#
Get all buckets in the project.
This will not populate the list of keys available in each bucket.
You can also iterate over the connection object, so these two operations are identical:
>>> from gcloud import storage
>>> connection = storage.get_connection(project, email, key_path)
>>> for bucket in connection.get_all_buckets():
...     print bucket
>>> # ... is the same as ...
>>> for bucket in connection:
...     print bucket
Return type: list of gcloud.storage.bucket.Bucket objects.
Returns: All buckets belonging to this project.
- get_bucket(bucket_name)[source]#
Get a bucket by name.
If the bucket isn't found, this will raise a gcloud.storage.exceptions.NotFound. If you would rather get None when the bucket isn't found (like {}.get('...')), use Connection.lookup() instead.
For example:
>>> from gcloud import storage
>>> from gcloud.storage import exceptions
>>> connection = storage.get_connection(project, email, key_path)
>>> try:
...     bucket = connection.get_bucket('my-bucket')
... except exceptions.NotFound:
...     print 'Sorry, that bucket does not exist!'
Parameters: bucket_name (string) – The name of the bucket to get.
Return type: gcloud.storage.bucket.Bucket
Returns: The bucket matching the name provided.
Raises: gcloud.storage.exceptions.NotFound
- lookup(bucket_name)[source]#
Get a bucket by name, returning None if not found.
You can use this if you would rather check for a None value than catch an exception:
>>> from gcloud import storage
>>> connection = storage.get_connection(project, email, key_path)
>>> bucket = connection.lookup('doesnt-exist')
>>> print bucket
None
>>> bucket = connection.lookup('my-bucket')
>>> print bucket
<Bucket: my-bucket>
Parameters: bucket_name (string) – The name of the bucket to get.
Return type: gcloud.storage.bucket.Bucket
Returns: The bucket matching the name provided or None if not found.
- make_request(method, url, data=None, content_type=None, headers=None)[source]#
A low level method to send a request to the API.
Typically, you shouldn’t need to use this method.
Parameters: - method (string) – The HTTP method to use in the request.
- url (string) – The URL to send the request to.
- data (string) – The data to send as the body of the request.
- content_type (string) – The proper MIME type of the data provided.
- headers (dict) – A dictionary of HTTP headers to send with the request.
Return type: tuple of response (a dictionary of sorts) and content (a string).
Returns: The HTTP response object and the content of the response.
- new_bucket(bucket)[source]#
Factory method for creating a new (unsaved) bucket object.
This method is really useful when you’re not sure whether you have an actual gcloud.storage.bucket.Bucket object or just a name of a bucket. It always returns the object:
>>> bucket = connection.new_bucket('bucket')
>>> print bucket
<Bucket: bucket>
>>> bucket = connection.new_bucket(bucket)
>>> print bucket
<Bucket: bucket>
Parameters: bucket (string or gcloud.storage.bucket.Bucket) – A name of a bucket or an existing Bucket object.
Iterators#
Iterators for paging through API responses.
These iterators simplify the process of paging through API responses where the response is a list of results with a nextPageToken.
To make an iterator work, just override the get_items_from_response method so that given a response (containing a page of results) it parses those results into an iterable of the actual objects you want:
class MyIterator(Iterator):
def get_items_from_response(self, response):
items = response.get('items', [])
for item in items:
yield MyItemClass.from_dict(item, other_arg=True)
You then can use this to get all the results from a resource:
>>> iterator = MyIterator(...)
>>> list(iterator) # Convert to a list (consumes all values).
Or you can walk your way through items and call off the search early if you find what you’re looking for (resulting in possibly fewer requests):
>>> for item in MyIterator(...):
...     print item.name
...     if not item.is_valid:
...         break
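Putting the pieces together, here is a self-contained sketch of the paging pattern described above. The FakeConnection class stands in for a real gcloud.storage.connection.Connection, and the simplified Iterator is illustrative rather than the library's actual implementation:

```python
class Iterator(object):
    """A minimal, illustrative pager modeled on the interface above."""

    PAGE_TOKEN = 'pageToken'

    def __init__(self, connection, path):
        self.connection = connection
        self.path = path
        self.page_number = 0
        self.next_page_token = None

    def __iter__(self):
        # Keep fetching pages while there is a nextPageToken (or we haven't
        # fetched anything yet).
        while self.page_number == 0 or self.next_page_token:
            response = self.get_next_page_response()
            for item in self.get_items_from_response(response):
                yield item

    def get_next_page_response(self):
        params = {}
        if self.next_page_token:
            params[self.PAGE_TOKEN] = self.next_page_token
        response = self.connection.api_request('GET', self.path, params)
        self.page_number += 1
        self.next_page_token = response.get('nextPageToken')
        return response

    def get_items_from_response(self, response):
        raise NotImplementedError


class NameIterator(Iterator):
    """Overrides the factory method to yield item names."""

    def get_items_from_response(self, response):
        for item in response.get('items', []):
            yield item['name']


class FakeConnection(object):
    """Serves two canned pages, linked by a nextPageToken."""

    def api_request(self, method, path, params):
        if params.get('pageToken') == 'page-2':
            return {'items': [{'name': 'c'}]}
        return {'items': [{'name': 'a'}, {'name': 'b'}],
                'nextPageToken': 'page-2'}
```

Iterating over `NameIterator(FakeConnection(), '/b')` walks both pages and yields 'a', 'b', then 'c'.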
- class gcloud.storage.iterator.Iterator(connection, path, extra_params=None)[source]#
Bases: object
A generic class for iterating through Cloud Storage list responses.
Parameters: - connection (gcloud.storage.connection.Connection) – The connection to use to make requests.
- path (string) – The path to query for the list of items.
- extra_params (dict) – Extra query-string parameters to include in each request.
- PAGE_TOKEN = 'pageToken'#
- RESERVED_PARAMS = frozenset(['pageToken'])#
- get_items_from_response(response)[source]#
Factory method called while iterating. This should be overridden.
This method should be overridden by a subclass. It should accept the API response of a request for the next page of items, and return a list (or other iterable) of items.
Typically this method will construct a Bucket or a Key from the page of results in the response.
Parameters: response (dict) – The response of asking for the next page of items.
Return type: iterable
Returns: Items that the iterator should yield.
- get_next_page_response()[source]#
Requests the next page from the path provided.
Return type: dict
Returns: The parsed JSON response of the next page's contents.
- get_query_params()[source]#
Getter for query parameters for the next request.
Return type: dict
Returns: A dictionary of query parameters.
Exceptions#
Custom exceptions for gcloud.storage package.
See: https://cloud.google.com/storage/docs/json_api/v1/status-codes
- exception gcloud.storage.exceptions.BadRequest(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘400 Bad Request’ response.
- code = 400#
- exception gcloud.storage.exceptions.ClientError(message, errors=())[source]#
Bases: gcloud.storage.exceptions.StorageError
Base for 4xx responses.
This class is abstract.
- exception gcloud.storage.exceptions.Conflict(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘409 Conflict’ response.
- code = 409#
- exception gcloud.storage.exceptions.Forbidden(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘403 Forbidden’ response.
- code = 403#
- exception gcloud.storage.exceptions.InternalServerError(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ServerError
Exception mapping a ‘500 Internal Server Error’ response.
- code = 500#
- exception gcloud.storage.exceptions.LengthRequired(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘411 Length Required’ response.
- code = 411#
- exception gcloud.storage.exceptions.MethodNotAllowed(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘405 Method Not Allowed’ response.
- code = 405#
- exception gcloud.storage.exceptions.MovedPermanently(message, errors=())[source]#
Bases: gcloud.storage.exceptions.Redirection
Exception mapping a ‘301 Moved Permanently’ response.
- code = 301#
- exception gcloud.storage.exceptions.NotFound(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘404 Not Found’ response.
- code = 404#
- exception gcloud.storage.exceptions.NotImplemented(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ServerError
Exception mapping a ‘501 Not Implemented’ response.
- code = 501#
- exception gcloud.storage.exceptions.NotModified(message, errors=())[source]#
Bases: gcloud.storage.exceptions.Redirection
Exception mapping a ‘304 Not Modified’ response.
- code = 304#
- exception gcloud.storage.exceptions.PreconditionFailed(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘412 Precondition Failed’ response.
- code = 412#
- exception gcloud.storage.exceptions.Redirection(message, errors=())[source]#
Bases: gcloud.storage.exceptions.StorageError
Base for 3xx responses.
This class is abstract.
- exception gcloud.storage.exceptions.RequestRangeNotSatisfiable(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘416 Request Range Not Satisfiable’ response.
- code = 416#
- exception gcloud.storage.exceptions.ResumeIncomplete(message, errors=())[source]#
Bases: gcloud.storage.exceptions.Redirection
Exception mapping a ‘308 Resume Incomplete’ response.
- code = 308#
- exception gcloud.storage.exceptions.ServerError(message, errors=())[source]#
Bases: gcloud.storage.exceptions.StorageError
Base for 5xx responses (abstract).
- exception gcloud.storage.exceptions.ServiceUnavailable(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ServerError
Exception mapping a ‘503 Service Unavailable’ response.
- code = 503#
- exception gcloud.storage.exceptions.StorageError(message, errors=())[source]#
Bases: exceptions.Exception
Base error class for gcloud errors (abstract).
Each subclass represents a single type of HTTP error response.
- code = None#
HTTP status code. Concrete subclasses must define.
- exception gcloud.storage.exceptions.TemporaryRedirect(message, errors=())[source]#
Bases: gcloud.storage.exceptions.Redirection
Exception mapping a ‘307 Temporary Redirect’ response.
- code = 307#
- exception gcloud.storage.exceptions.TooManyRequests(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘429 Too Many Requests’ response.
- code = 429#
- exception gcloud.storage.exceptions.Unauthorized(message, errors=())[source]#
Bases: gcloud.storage.exceptions.ClientError
Exception mapping a ‘401 Unauthorized’ response.
- code = 401#
- gcloud.storage.exceptions.make_exception(response, content)[source]#
Factory: create exception based on HTTP response code.
Return type: instance of StorageError, or a concrete subclass.
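A minimal sketch of how such a factory can dispatch on the status code, using simplified stand-in classes rather than the real module's full hierarchy (the real make_exception receives a response object rather than a bare status integer; that simplification is ours):

```python
class StorageError(Exception):
    """Illustrative stand-in for gcloud.storage.exceptions.StorageError."""
    code = None


class NotFound(StorageError):
    code = 404


class Conflict(StorageError):
    code = 409


class InternalServerError(StorageError):
    code = 500


# Registry mapping HTTP status codes to concrete exception classes.
_HTTP_CODE_TO_EXCEPTION = {
    klass.code: klass
    for klass in (NotFound, Conflict, InternalServerError)
}


def make_exception(status, content):
    """Pick the exception class matching the response status code.

    Unknown codes fall back to the generic StorageError.
    """
    klass = _HTTP_CODE_TO_EXCEPTION.get(status, StorageError)
    return klass(content)
```

So a 404 response maps to NotFound, while an unregistered code like 418 falls back to the base StorageError.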