Design usage events
Success with Metronome depends on the data you provide, so it's important to design usage events well. We recommend three principles for designing effective usage events:
- Work backward from what you need
- Work forward from what you have
- Maximize your flexibility
This guide outlines a hypothetical scenario where you're a developer at a Content Delivery Network (CDN) company, and you've been tasked with integrating your system with Metronome to support usage-based billing.
Work backward from what you need
It's important to start with your existing requirements and work backward from there. In the case of your hypothetical CDN, your team charges customers based on their monthly data usage. However, the exact pricing details are unknown. This is fine as pricing can be applied and adjusted later as long as you have the required metrics in place.
An additional consideration is that your customer support team wants to take advantage of Metronome as a real-time data platform to notify customers when there's an unusual spike in traffic for their sites.
For both invoicing and notifications, you need to measure data transfer, so the bare minimum usage event looks like this:
{
  "event_type": "transfer",
  "properties": {
    "bytes": "1234"
  },
  "transaction_id": "...",
  "customer_id": "...",
  "timestamp": "..."
}
This supports a billable metric like "sum of bytes for all events of type transfer (for a given customer, for a given billing period)." Next, consider what data you have available and how that data might help in the future.
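To make the billable metric described above concrete, here is a minimal Python sketch of the same aggregation run locally over a list of event dictionaries. It is an illustration only, not Metronome's implementation: the names `parse_ts`, `sum_bytes`, and `EVENTS`, the sample values, and the assumption that timestamps are RFC 3339 strings are all hypothetical.

```python
from datetime import datetime, timezone


def parse_ts(ts: str) -> datetime:
    # Assumption: timestamps are RFC 3339 strings. Rewrite a trailing "Z"
    # so that older versions of datetime.fromisoformat accept it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def sum_bytes(events, customer_id, period_start, period_end):
    """Sum properties.bytes over 'transfer' events for one customer
    whose timestamp falls in [period_start, period_end)."""
    total = 0
    for event in events:
        if event["event_type"] != "transfer":
            continue
        if event["customer_id"] != customer_id:
            continue
        if not (period_start <= parse_ts(event["timestamp"]) < period_end):
            continue
        total += int(event["properties"]["bytes"])
    return total


# Hypothetical sample events in the shape shown above.
EVENTS = [
    {"event_type": "transfer", "properties": {"bytes": "1234"},
     "transaction_id": "t1", "customer_id": "cust_a",
     "timestamp": "2024-01-15T12:00:00Z"},
    {"event_type": "transfer", "properties": {"bytes": "1000"},
     "transaction_id": "t2", "customer_id": "cust_a",
     "timestamp": "2024-02-01T00:00:00Z"},  # falls in the next period
    {"event_type": "transfer", "properties": {"bytes": "500"},
     "transaction_id": "t3", "customer_id": "cust_b",
     "timestamp": "2024-01-20T00:00:00Z"},  # different customer
]

january = (datetime(2024, 1, 1, tzinfo=timezone.utc),
           datetime(2024, 2, 1, tzinfo=timezone.utc))
print(sum_bytes(EVENTS, "cust_a", *january))
```

Only the first sample event matches all three conditions (event type, customer, and billing period), so only its bytes are counted.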
Work forward from what you have
- When to send these events. Your system could track total data transfer internally and send a daily per-customer summary to Metronome. Or you could send Metronome an event every time a web page is served. Both options provide the same invoicing ability at the end of each month, so the choice depends on your needs.
- What data to include. There's a lot more information to potentially include in these usage events. For example, you could include which data center served the page, what domain was being hosted, the type of file, or even what URL was accessed. None of this is immediately necessary for invoicing, but it could be useful for your own records.
The timing and content of usage events are often heavily influenced by what is available in your existing system. At your hypothetical CDN company, imagine that you perform global log aggregation with Apache Flume. In this case, there's a central data store with detailed access logs available, making it easy to send those log messages to Metronome as they arrive in the form of individual usage events.
Now imagine that you don't have such global aggregation. Instead, each data center keeps its own independent log and sends hourly summaries back to the central data store, broken down by domain. You'd like to have each data center send usage events directly to Metronome, but unfortunately the code there doesn't have access to the customer database, so it can't determine what customer_id to fill in for each event. In this case, the hourly summaries are probably the best option. From the central location, it's easy to look up the owner of each domain and provide the appropriate customer_id.
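As a rough illustration of that central step, the Python sketch below turns one hourly per-domain summary into a usage event. Everything specific here is an assumption for the example: the summary fields (`data_center`, `domain`, `hour_start`, `bytes`), the `DOMAIN_OWNERS` table standing in for the customer database, and the choice to derive `transaction_id` from a hash so that re-sending the same summary produces the same ID.

```python
import hashlib

# Hypothetical stand-in for the customer database: domain -> customer_id.
DOMAIN_OWNERS = {
    "www.example.com": "cust_123",
    "cdn.example.org": "cust_456",
}


def summary_to_event(summary, domain_owners):
    """Convert one hourly per-domain summary into a usage event dict."""
    domain = summary["domain"]
    # Raises KeyError for a domain with no known owner, so unattributed
    # traffic is caught before anything is sent.
    customer_id = domain_owners[domain]

    # Assumption of this sketch: a deterministic ID built from the fields
    # that identify the summary (data center, domain, hour).
    key = f'{summary["data_center"]}|{domain}|{summary["hour_start"]}'
    transaction_id = hashlib.sha256(key.encode("utf-8")).hexdigest()

    return {
        "event_type": "transfer",
        "properties": {
            "domain": domain,
            "data_center": summary["data_center"],
            "bytes": str(summary["bytes"]),
        },
        "transaction_id": transaction_id,
        "customer_id": customer_id,
        "timestamp": summary["hour_start"],
    }


summary = {
    "data_center": "US-WEST-3",
    "domain": "www.example.com",
    "hour_start": "2024-01-15T12:00:00Z",
    "bytes": 12345789,
}
event = summary_to_event(summary, DOMAIN_OWNERS)
print(event["customer_id"], event["properties"]["bytes"])
```

Failing loudly on an unknown domain is a design choice for the sketch; a real pipeline might instead queue such summaries for review.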
Before deciding to send the hourly summaries, you check back with the customer support team about those traffic spike notifications they wanted. They assure you that the hourly cadence is fast enough for the notifications they want to send.
Now that you've settled on how often to send usage events, you still need to decide exactly what information to include.
Maximize your flexibility
Requirements and circumstances change. Instead of trying to predict all your future business needs, aim to maximize flexibility so changes are easy.
In Metronome, flexibility is maximized when you send as much data as possible. Metronome's stream pipeline can handle high event throughput, and irrelevant data is discarded during processing. This means there's no downside to sending information that isn't going to be used right away.
There is, however, a big upside to sending extra information. Suppose that your hypothetical CDN company starts getting feedback from customers that they don't understand their bills. Many customers are responsible for more than one domain and would like to be able to see how much each domain is contributing to their total usage. Your executives ask you, as the assigned developer, to fix this.
If your usage event didn't include the domain, you'd need to go back to your code and add it. But you chose to send as much data as possible, so it's already there:
{
  "event_type": "transfer",
  "properties": {
    "domain": "www.example.com",
    "data_center": "US-WEST-3",
    "bytes": "12345789"
  },
  "transaction_id": "...",
  "customer_id": "...",
  "timestamp": "..."
}
All you need to do is query the Metronome API for usage data grouped by domain. And your customers are happy with the new breakdown on their invoices.
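To show what a per-domain breakdown means in terms of the event data, here is a small Python sketch that groups events locally by a property. It does not call or imitate the Metronome API; the function name `usage_by_property` and the sample events are hypothetical.

```python
from collections import defaultdict


def usage_by_property(events, key):
    """Total properties.bytes for each distinct value of properties[key]."""
    totals = defaultdict(int)
    for event in events:
        props = event["properties"]
        totals[props[key]] += int(props["bytes"])
    return dict(totals)


# Hypothetical events for a single customer with two domains.
EVENTS = [
    {"event_type": "transfer", "customer_id": "cust_123",
     "properties": {"domain": "www.example.com",
                    "data_center": "US-WEST-3", "bytes": "100"}},
    {"event_type": "transfer", "customer_id": "cust_123",
     "properties": {"domain": "www.example.com",
                    "data_center": "EU-CENTRAL-1", "bytes": "200"}},
    {"event_type": "transfer", "customer_id": "cust_123",
     "properties": {"domain": "blog.example.com",
                    "data_center": "US-WEST-3", "bytes": "50"}},
]

print(usage_by_property(EVENTS, "domain"))
```

Because the grouping key is just a property name, the same events could be broken down by `data_center` with no change to what was sent.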
As time passes and your customer base grows, the finance team discovers a worrisome problem. Your company has been billing customers based on their total data transfer, but your bandwidth costs are different in different parts of the world. In some cases, you're actually losing money by undercharging for data transfer in certain regions.
As before, if your usage events didn't include information about where the data transfer occurred, you'd have to go back into your code and add it. But because you already decided to send as much data as possible, there's a data_center field that will work. Billable metrics in Metronome can filter usage events in a variety of ways, so you are able to use a mapping of data center names to regions to define a new billable metric for each region. Going forward, you can bill based on those new metrics, where you can set individual prices for each region.
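The region mapping can be sketched in Python as follows. This is a local illustration of the idea, not how billable metric filters are configured in Metronome; the data center names other than US-WEST-3, the region names, and the `unmapped` bucket are all invented for the example.

```python
from collections import defaultdict

# Hypothetical mapping of data center names to billing regions.
DATA_CENTER_REGIONS = {
    "US-WEST-3": "north_america",
    "US-EAST-1": "north_america",
    "EU-CENTRAL-1": "europe",
}


def bytes_by_region(events, dc_to_region):
    """Sum properties.bytes per region, using properties.data_center."""
    totals = defaultdict(int)
    for event in events:
        props = event["properties"]
        # Data centers missing from the mapping go to an explicit bucket,
        # so gaps in the mapping are visible rather than silently dropped.
        region = dc_to_region.get(props["data_center"], "unmapped")
        totals[region] += int(props["bytes"])
    return dict(totals)


EVENTS = [
    {"event_type": "transfer",
     "properties": {"data_center": "US-WEST-3", "bytes": "100"}},
    {"event_type": "transfer",
     "properties": {"data_center": "US-EAST-1", "bytes": "25"}},
    {"event_type": "transfer",
     "properties": {"data_center": "EU-CENTRAL-1", "bytes": "200"}},
    {"event_type": "transfer",
     "properties": {"data_center": "AP-SOUTH-2", "bytes": "7"}},
]

print(bytes_by_region(EVENTS, DATA_CENTER_REGIONS))
```

Each region total corresponds to one per-region metric, which is what allows a separate price per region.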
Metronome operates on streams; changes you make only affect future data collection and aggregation. New billable metrics cannot be applied to historical data.