Comparison of Instance Metadata Services

Ahmet Alp Balkan

An instance metadata service is a server available to virtual machines hosted on cloud providers (often at http://169.254.169.254/). It provides useful information about the VM itself and its environment, which the VM would typically not have access to otherwise.

It is often used in scripts to configure VM instances and tell them apart, and it helps a great deal in bootstrapping cluster orchestrators such as Kubernetes and Mesos.

In a nutshell, the metadata server works like this:

$ curl http://169.254.169.254/latest/meta-data/public-ipv4
54.165.97.141
$ curl http://169.254.169.254/latest/meta-data/instance-type
t2.micro
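The same lookup can be scripted. Here is a minimal sketch in Python, assuming the endpoints and the GCE header used in the commands throughout this article. The `ENDPOINTS` table and the `metadata_request` and `fetch_metadata` helpers are hypothetical names for illustration, not part of any provider SDK.

```python
import urllib.request

# Base URLs and required headers, copied from the commands in this article.
ENDPOINTS = {
    "aws": ("http://169.254.169.254/latest/meta-data/", {}),
    "digitalocean": ("http://169.254.169.254/metadata/v1/", {}),
    "gce": (
        "http://metadata.google.internal/computeMetadata/v1/",
        {"X-Google-Metadata-Request": "True"},
    ),
}


def metadata_request(provider, key):
    """Build (but do not send) the HTTP request for one metadata key."""
    base, headers = ENDPOINTS[provider]
    return urllib.request.Request(base + key.lstrip("/"), headers=headers)


def fetch_metadata(provider, key, timeout=2.0, opener=urllib.request.urlopen):
    """Fetch one metadata value as text. Only reachable from inside a VM."""
    with opener(metadata_request(provider, key), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()
```

On an EC2 instance, `fetch_metadata("aws", "instance-type")` should return the same value as the second curl call above. The `opener` parameter exists only so the function can be exercised without a real metadata server.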

In this article, I look at the metadata service offerings of AWS EC2, Google Compute Engine and DigitalOcean and compare them. At the time of writing, Microsoft Azure does not provide a metadata service similar to these.

Table of Contents

  1. DigitalOcean: Highlights
  2. AWS EC2: Highlights
  3. Google Compute Engine: Highlights
  4. Feature Comparison Chart
  5. Performance Benchmarks
  6. Conclusion

1. DigitalOcean: Highlights

Documentation: https://developers.digitalocean.com/…/metadata/

Although DigitalOcean is not a big player or a full-blown cloud provider, their VPS offering is widely adopted and their lean approach to cloud instances (droplets) is very practical to use.

Good:

Bad:

2. AWS EC2: Highlights

Documentation: https://docs.aws.amazon.com/…/ec2-instance-metadata.html

Amazon Web Services was pretty much the first player in the cloud market. In fact, they might as well be the ones who invented the whole concept of “instance metadata service” and the IP address 169.254.169.254.

Although it is very much the de facto standard of metadata services, I found that it is not modern enough and not really dynamic.

Good:

Bad:

3. Google Compute Engine: Highlights

Documentation: https://cloud.google.com/compute/docs/metadata

Maybe it’s the advantage of being the last one to join the party, but GCE’s metadata service is just perfect. It provides a great deal of flexibility and is very dynamic, yet it is still not rocket science.

Good:

Bad:

4. Feature Comparison Chart

Feature                  DO   AWS  GCE
cloud-init               Yes  Yes  Yes
External IP              Yes  Yes  Yes
SSH Public Keys          Yes  Yes  Yes
Region/Zone              Yes  Yes  Yes
Disks                    N/A  Yes  Yes
Machine type/size        No   Yes  Yes
Dynamic custom metadata  No   No   Yes
Watch for changes        No   No   Yes
Security credentials     No   Yes  Yes
JSON response format     Yes  Meh  Yes
Ability to disable       No   Yes  No
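
To illustrate the “Watch for changes” row: as far as I understand the GCE documentation, a request can block until a value changes by adding a `wait_for_change=true` query parameter, optionally with `timeout_sec` and `last_etag`. The sketch below only builds such a URL. The `watch_url` helper is a hypothetical name, the parameter names are my reading of the GCE docs rather than something tested for this article, and the request still needs GCE's metadata header.

```python
from urllib.parse import urlencode

GCE_BASE = "http://metadata.google.internal/computeMetadata/v1/"


def watch_url(key, timeout_sec=None, last_etag=None):
    """Build a URL that blocks until `key` changes (or the timeout expires)."""
    params = {"wait_for_change": "true"}
    if timeout_sec is not None:
        params["timeout_sec"] = str(timeout_sec)
    if last_etag is not None:
        params["last_etag"] = last_etag
    return GCE_BASE + key.lstrip("/") + "?" + urlencode(params)
```

A script would issue a GET against this URL in a loop, handling each response as a new value of the watched key.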

5. Performance Benchmarks

Metadata services are usually meant to be queried only once to bootstrap things, or maybe a few times a day, so performance rarely matters. However, out of curiosity, I tested these metadata services by sending 10,000 requests (100 in parallel) to see how they perform.
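
For readers who want to reproduce this kind of test without the boom tool, the method can be sketched in Python roughly as follows. This is not how boom is implemented. `run_load_test` and its `send` callback are hypothetical names, and `send` is assumed to perform one HTTP request and return its status code.

```python
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_load_test(send, total=10_000, concurrency=100):
    """Call send() `total` times using `concurrency` worker threads.

    `send` must return an HTTP status code, or raise on a transport error.
    """
    def timed_call(_):
        start = time.perf_counter()
        try:
            status = send()
        except Exception:
            status = "error"  # count transport failures separately
        return status, time.perf_counter() - start

    began = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total)))
    elapsed = time.perf_counter() - began

    latencies = [latency for _, latency in results]
    return {
        "total_secs": elapsed,
        "slowest": max(latencies),
        "fastest": min(latencies),
        "average": sum(latencies) / len(latencies),
        "requests_per_sec": total / elapsed,
        "status_codes": Counter(status for status, _ in results),
    }
```

The returned dictionary mirrors the main fields of the summaries shown in this section: total time, slowest, fastest and average latency, throughput, and the status code distribution.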

DigitalOcean applied some form of throttling in some test runs (presumably based on an undocumented rate limit), but the service usually recovered quickly afterwards.

$ ./boom -c 100 -n 10000 http://169.254.169.254/metadata/v1/id

Summary:
  Total:    4.1009 secs.
  Slowest:  0.2282 secs.
  Fastest:  0.0086 secs.
  Average:  0.0406 secs.
  Requests/sec: 2438.4929
  Total Data Received:  70000 bytes.
  Response Size per Request:    7 bytes.

Status code distribution:
  [200] 10000 responses

Response time histogram:
  0.009 [1]   |
  0.031 [2931]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.053 [5418]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.075 [1132]|∎∎∎∎∎∎∎∎
  0.096 [339] |∎∎
  0.118 [78]  |
  0.140 [6]   |
  0.162 [4]   |
  0.184 [20]  |
  0.206 [58]  |
  0.228 [13]  |

Google Compute Engine performs really well at this concurrency level. When I bump up the load and the concurrency, a long tail starts to show up and the server gets slower, as expected. I observed no explicit throttling.

$ ./boom -c 100 -n 10000 -h 'X-Google-Metadata-Request:True' 'http://metadata.google.internal/computeMetadata/v1/instance/id'

Summary:
  Total:    1.7962 secs.
  Slowest:  0.2097 secs.
  Fastest:  0.0045 secs.
  Average:  0.0178 secs.
  Requests/sec: 5567.3540
  Total Data Received:  200000 bytes.
  Response Size per Request:    20 bytes.

Status code distribution:
  [200] 10000 responses

Response time histogram:
  0.005 [1]   |
  0.025 [9387]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.046 [610] |∎∎
  0.066 [0]   |
  0.087 [0]   |
  0.107 [0]   |
  0.128 [0]   |
  0.148 [0]   |
  0.169 [0]   |
  0.189 [0]   |
  0.210 [2]   |

The AWS EC2 Instance Metadata Service performed far worse than the others under load and frequently returned HTTP 429 (Too Many Requests) responses. I only managed to get a fully successful run once I lowered the concurrency level below 10.

$ ./boom -c 100 -n 10000 http://169.254.169.254/latest/meta-data/instance-id

Summary:
  Total:    45.6048 secs.
  Slowest:  7.4325 secs.
  Fastest:  0.0006 secs.
  Average:  0.4474 secs.
  Requests/sec: 218.1568
  Total Data Received:  2859403 bytes.
  Response Size per Request:    287 bytes.

Status code distribution:
  [200] 2086 responses
  [429] 7863 responses

Response time histogram:
  0.001 [1]   |
  0.744 [6570]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  1.487 [3068]|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  2.230 [51]  |
  2.973 [8]   |
  3.717 [198] |∎
  4.460 [6]   |
  5.203 [0]   |
  5.946 [0]   |
  6.689 [2]   |
  7.433 [45]  |

Error distribution:
  [51]  Get http://169.254.169.254/latest/meta-data/instance-id: EOF
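
Given throttling like this, a script that queries the EC2 metadata service from many processes at once may want to retry on HTTP 429. Below is a minimal sketch of retrying with exponential backoff. The `get_with_backoff` helper, its defaults, and the `send` callback (assumed to return a `(status_code, body)` pair) are all hypothetical, not taken from any AWS tooling.

```python
import random
import time


def get_with_backoff(send, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call send() until it returns a status other than 429.

    Waits base_delay * 2**attempt seconds, plus up to 50% random jitter,
    between attempts. Returns the last (status, body) pair either way.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    return status, body
```

The jitter spreads out retries from clients that were throttled at the same moment, so they do not all hit the server again together.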

6. Conclusion

It’s clear that Google Compute Engine’s instance metadata service is well thought out and carefully designed. I can see it being useful in many scenarios such as cluster bootstrapping.

AWS EC2 and DigitalOcean do not support custom metadata and are not very dynamic, which has been a big turn-off for me.

I appreciate any comments, discussion and possibly comparisons with other environments such as OpenStack Nova.


Update: Made several fixes to the article based on Alex Yukhanov’s comments.

