
How to choose metrics for your machine learning projects

December 6, 2024


One of the great challenges product managers face when working on machine learning projects is bridging the gap between ML metrics and business metrics. On the one hand, your team has to track the performance of the model with metrics such as accuracy, AUC, and F1 score. On the other hand, you must find ways to measure the business impact of the model. While data scientists focus on improving model performance, the PM’s job is to ensure these improvements translate into tangible business results.

The key difference is simple: model metrics tell you how well your model performs technically, while business metrics show the actual value it creates. For example, a model with 99% accuracy might still fail to deliver business value if it’s solving the wrong problem.
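One way a high-accuracy model can still create no value is sketched below. This is an illustration rather than a case from a real project: the transaction counts are made up, and the "model" is a hypothetical one that never flags anything as fraud.

```python
# Hypothetical data: 1,000 transactions, 10 of which are fraudulent.
labels = [1] * 10 + [0] * 990          # 1 = fraud, 0 = legitimate

# A hypothetical "model" that predicts "legitimate" for every transaction.
predictions = [0] * len(labels)

# Model metric: share of predictions that match the labels.
correct = sum(1 for y, p in zip(labels, predictions) if y == p)
accuracy = correct / len(labels)

# Business-facing view: how many fraudulent transactions were actually caught?
fraud_caught = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)

print(f"accuracy: {accuracy:.0%}, fraud caught: {fraud_caught} of {sum(labels)}")
```

The accuracy looks excellent, yet not a single fraudulent transaction is stopped, so the business metric does not move.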

Here’s how to choose the right business metrics:

1. Start with clear business goals

Define what you’re trying to achieve. Are you reducing fraud? Increasing sales? Improving user experience? Be specific about the impact you want to have on the business. For example, in an e-commerce platform, the business goal can be “increase average order value.” In a financial service, your business goal might be “reduce fraudulent transactions.”

2. Define your intervention

What kind of action do you want to take to achieve your goal? For example, to reduce fraud, your target action is blocking suspicious transactions. To increase the order value, you might decide to recommend five items that are relevant to the user’s shopping cart.

3. Map model impact to business outcomes

Determine how your model directly influences business goals. An ML model can help achieve the fraud detection goal by predicting which transactions are fraudulent. A recommendation system can increase average order value by predicting which items the user is most likely to purchase.

4. Choose measurable business metrics

Select metrics that directly tie to your goals. For fraud detection, track “dollars saved from prevented fraud.” For recommendations, measure “incremental revenue from recommended items.” Note that the business metric you choose can affect which metric you use to evaluate the model.

5. Consider constraints

Don’t optimize one metric at the expense of other important factors. For example, a fraud detection system that catches all fraud but blocks too many legitimate transactions isn’t successful. These constraints become your secondary metrics to monitor.
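For the fraud example, the primary metric and its constraint can be computed side by side. The following is a minimal sketch, assuming you have a log of transactions with the amount, the eventual ground truth, and the blocking decision. The `Transaction` fields, the function name, and the sample values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    amount: float     # transaction value in dollars
    is_fraud: bool    # ground truth, known after investigation
    blocked: bool     # decision made by the model-driven system

def fraud_business_metrics(transactions):
    """Return the primary business metric and a secondary constraint metric."""
    # Primary metric: dollars saved from prevented fraud.
    dollars_saved = sum(t.amount for t in transactions if t.is_fraud and t.blocked)
    # Fraud that slipped through, for context.
    fraud_missed = sum(t.amount for t in transactions if t.is_fraud and not t.blocked)
    # Constraint: share of legitimate transactions that were blocked.
    legit = [t for t in transactions if not t.is_fraud]
    legit_blocked = sum(1 for t in legit if t.blocked)
    legit_block_rate = legit_blocked / len(legit) if legit else 0.0
    return {
        "dollars_saved": dollars_saved,
        "fraud_missed": fraud_missed,
        "legit_block_rate": legit_block_rate,
    }

# Made-up sample log for illustration only.
sample = [
    Transaction(500.0, True, True),
    Transaction(120.0, True, False),
    Transaction(80.0, False, True),
    Transaction(40.0, False, False),
    Transaction(60.0, False, False),
    Transaction(25.0, False, False),
]
metrics = fraud_business_metrics(sample)
print(metrics)
```

Reporting both numbers together makes it harder to celebrate a rise in dollars saved that was bought with a spike in blocked legitimate customers.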

Remember: The best business metrics are specific, measurable, and directly tied to business value. They should tell a clear story about how your ML model impacts the bottom line.

Practical example: Customer churn prediction

Say you’re running an online service and you want to create an AI-powered customer churn prediction system for your company. The business goal you want to achieve is to reduce the number of customers who unsubscribe from your service after being with you for at least one year.

After talking to your sales team, you decide that the best intervention is to reach out to customers who are likely to churn and give them a discount on their next purchase. According to the sales team, you need to reach out at least two months before the customer churns to have enough time to turn them around.

Machine learning can help achieve this goal by predicting which customers will churn. But the model will have to match your business goals.

Your target business metric is “percent of customers churned after one year.” Your ML model’s metric will be “probability of churn within the next two months.” Your model will only be trained and applied to customers who have been on the service for at least a year (note how the business goal affects the model metric and training).
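To make the link between the business goal and the training data concrete, here is a rough sketch of how training labels could be built. It assumes each customer record has a signup date and an optional cancellation date, that "one year" means 365 days, and that "two months" means 61 days. The function name and these constants are assumptions for illustration.

```python
from datetime import date, timedelta

MIN_TENURE = timedelta(days=365)   # "at least one year" (assumed to be 365 days)
HORIZON = timedelta(days=61)       # "next two months" (assumed to be 61 days)

def churn_label(signup, cancelled, snapshot):
    """Return None if the customer is out of scope at the snapshot date,
    otherwise 1 if they churn within the horizon and 0 if they do not."""
    if cancelled is not None and cancelled <= snapshot:
        return None                      # already churned, nothing to predict
    if snapshot - signup < MIN_TENURE:
        return None                      # not yet a one-year customer
    churned_soon = cancelled is not None and cancelled <= snapshot + HORIZON
    return int(churned_soon)

snapshot = date(2024, 6, 1)              # hypothetical prediction date

label_churner = churn_label(date(2023, 1, 15), date(2024, 7, 10), snapshot)
label_loyal = churn_label(date(2022, 3, 1), None, snapshot)
label_new = churn_label(date(2024, 1, 1), None, snapshot)
label_late = churn_label(date(2022, 5, 1), date(2024, 11, 20), snapshot)

print(label_churner, label_loyal, label_new, label_late)
```

Customers who return `None` are left out of both training and scoring, which is how the one-year requirement in the business goal shapes the dataset.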

Now, you have to think about constraints. In this case, every time you give a discount, you are losing revenue. You should keep track of “revenue lost from discounts” and make sure that it does not exceed the value gained from retaining customers. For example, if you start giving out discounts aggressively, you might end up offering them to customers who would not have churned, and every one of those discounts is lost revenue. The balance between the main metric and the constraint can help you adjust your ML metrics.
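The trade-off can be written as simple arithmetic. The sketch below assumes a flat discount cost per customer and a flat value per retained customer. All of the figures are hypothetical, and the function name is made up for this example.

```python
def campaign_net_value(discounts_given, churners_retained, discount_cost, customer_value):
    """Value gained from retained customers minus revenue lost from discounts."""
    value_gained = churners_retained * customer_value
    revenue_lost = discounts_given * discount_cost   # includes discounts to non-churners
    return value_gained - revenue_lost

# Targeted campaign: fewer discounts, most of them reaching real churners.
targeted = campaign_net_value(
    discounts_given=1000, churners_retained=150, discount_cost=20, customer_value=200
)

# Aggressive campaign: more churners retained, but far more wasted discounts.
aggressive = campaign_net_value(
    discounts_given=5000, churners_retained=300, discount_cost=20, customer_value=200
)

print(f"targeted: {targeted}, aggressive: {aggressive}")
```

With these made-up numbers, the aggressive campaign retains twice as many customers and still loses money, because the discounts handed to customers who would have stayed outweigh the gain.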

After a few simulations with your historical data, you decide to set the threshold for your ML metric (probability of churn within the next two months) to 80% to make sure that your churn prediction system generates enough value.
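A simulation of this kind can be sketched as a sweep over candidate thresholds. The example assumes you have historical model scores with known churn outcomes, a flat discount cost, a flat customer value, and an estimate of how often a discount actually retains a would-be churner (`save_rate`). The data and every parameter value below are invented for illustration.

```python
def simulate_threshold(scores, churned, threshold, discount_cost, customer_value, save_rate):
    """Estimated net value of discounting every customer scored at or above the threshold."""
    targeted = [c for s, c in zip(scores, churned) if s >= threshold]
    discounts_given = len(targeted)
    churners_reached = sum(targeted)
    value_gained = churners_reached * save_rate * customer_value
    revenue_lost = discounts_given * discount_cost
    return value_gained - revenue_lost

# Hypothetical historical data: predicted churn probability and actual outcome.
scores = [0.95, 0.90, 0.85, 0.82, 0.70, 0.65, 0.55, 0.40, 0.30, 0.10]
churned = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical business assumptions.
params = dict(discount_cost=20, customer_value=200, save_rate=0.5)

candidates = [0.5, 0.6, 0.7, 0.8, 0.9]
results = {t: simulate_threshold(scores, churned, t, **params) for t in candidates}
best = max(results, key=results.get)

for t, value in results.items():
    print(f"threshold {t:.1f}: net value {value:.0f}")
print(f"best threshold: {best}")
```

On this toy data the sweep happens to favor 0.8. With real data the best threshold depends heavily on the assumed discount cost, customer value, and save rate, so it is worth rerunning the simulation whenever those assumptions change.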

As you can see in this example, applied machine learning has a lot of nuances that need to be considered. Make sure to consider both the technical and business aspects when choosing metrics.

If you’re interested in learning to create LLM applications, GoPractice has a fantastic GenAI Simulator course that gives you the perfect framework to think about generative AI and what kinds of problems you can solve with it. If you want to learn more about ML product management in general, you can try their broader AI/ML Simulator for PM course. I highly recommend both courses.

If you want to develop an LLM application for your organization but don’t know where to start, contact me.
