@craigpatten
Last active August 29, 2015 14:17
Hitachi Content Platform (HCP) - invalid HTTP responses for objects that are a multiple of 4GB in size.

The Hitachi Content Platform (HCP) provides an interface which pretends to be Amazon S3, so you can access it with packages such as boto. This is nice.

However, when attempting to access objects that are a multiple of 4GB in size, the HCP returns multiple, inconsistent Content-Length headers; one says zero, the other has the correct size. This is an RFC violation (RFC 7230, section 3.3.3, treats differing Content-Length values as invalid message framing) and will likely crash any application that isn't a little bit creative in the way it parses these headers.

For example, when boto attempts a lookup on such an object, it throws an exception, which is an acceptable action because the invalid Content-Length renders the entire response invalid. The effect of this exception on the application or end-user entirely depends on how boto is used.

Request from boto:

HEAD /[...] HTTP/1.1
Host: [...]
Accept-Encoding: identity
Date: Wed, 18 Mar 2015 00:48:06 GMT
Content-Length: 0
Authorization: AWS [...]
User-Agent: Boto/2.36.0 Python/2.7.6 Darwin/14.1.0

Response from the HCP:

HTTP/1.1 200 OK
Date: Wed, 18 Mar 2015 00:48:06 GMT
Server: HCP V7.0.1.17H1004
ETag: "c9a5a6878d97b48cc965c1e41859f034"
Last-Modified: Mon, 23 Feb 2015 04:42:36 GMT
Content-Type: application/octet-stream
Content-Length: 4294967296
Content-Length: 0

Resultant boto stacktrace:

Traceback (most recent call last):
  File "./hcp-4gb.py", line 14, in <module>
    object = bucket.get_key("[...]")
  File "/Library/Python/2.7/site-packages/boto/s3/bucket.py", line 192, in get_key
    key, resp = self._get_key_internal(key_name, headers, query_args_l)
  File "/Library/Python/2.7/site-packages/boto/s3/bucket.py", line 216, in _get_key_internal
    k.size = int(response.getheader('content-length'))
ValueError: invalid literal for int() with base 10: '4294967296, 0'
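
As a rough idea of what "a little bit creative" parsing could look like on the client side, here is a minimal sketch. The helper name `parse_content_length` is hypothetical and is not part of boto. It assumes the repeated headers arrive joined as one comma-separated string, as in the traceback above, and that for this particular fault the largest value is the real size. Rejecting the response outright remains the stricter and safer choice.

```python
def parse_content_length(raw):
    """Return one size from a possibly duplicated Content-Length value.

    Hypothetical helper, not a boto API. Assumes duplicates arrive joined
    as a comma-separated string such as '4294967296, 0', and that the
    largest value is the real size (only reasonable for this HCP fault).
    """
    # int() tolerates the surrounding whitespace left over from the split.
    values = [int(part) for part in raw.split(",") if part.strip()]
    if not values:
        raise ValueError("no Content-Length value in %r" % (raw,))
    return max(values)
```

With the header value from the traceback, `parse_content_length('4294967296, 0')` gives the 4GB size instead of raising ValueError.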

This fault is present in HCP V7.0.1.17H1004, though it's probably trivial to fix.
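
One unconfirmed guess at the cause: 4GB is exactly 2**32 bytes, so any size that is a multiple of it has all-zero low 32 bits, and a code path that stores the size in a 32-bit unsigned integer would see zero. The sketch below only illustrates that arithmetic; the function name `low_32_bits` is made up for the example and nothing here is known about how the HCP actually computes the header.

```python
FOUR_GB = 4 * 1024 ** 3  # 4294967296 bytes, the size in the response above


def low_32_bits(size):
    """Return the value a 32-bit unsigned field would hold for this size."""
    return size & 0xFFFFFFFF


# Sizes that are multiples of 4GB collapse to zero; other sizes do not.
truncated = [low_32_bits(n * FOUR_GB) for n in (1, 2, 3)]
not_a_multiple = low_32_bits(FOUR_GB + 1)
```

If this guess is right, it would explain both the stray zero and why only multiples of 4GB are affected.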

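#!/usr/bin/env python
# Reproduces the fault with boto 2.x on Python 2.7 (the versions in the User-Agent above).
import base64, hashlib
from boto.s3.connection import S3Connection
server = "your-hcp-endpoint.acme.com"
# The S3 credentials are derived from the HCP account:
# access key = base64(username), secret key = hex MD5 of the password.
hs3_id = base64.b64encode("your-hcp-username")
hs3_secret = hashlib.md5("your-hcp-password").hexdigest()
# debug = 2 makes boto print the HTTP request and response headers.
hs3 = S3Connection(aws_access_key_id = hs3_id, aws_secret_access_key = hs3_secret, host = server, debug = 2)
bucket = hs3.get_bucket("your-bucket-name")
# get_key() sends the HEAD request; on an affected HCP it raises the ValueError shown above.
print bucket.get_key("object-that-is-a-multiple-of-4GB-in-size")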