RevenueBase Blog

How We Fixed Job Title Classification at Scale

The problem with job titles has always been context. We just solved it.

Job title classification is one of those problems that seems simple until you actually try to do it. Most B2B data providers are still using the same basic approaches from years ago: keyword matching, lookup tables, maybe some regex patterns. It works fine until you realize that a CMO in healthcare is a Chief Medical Officer, a role that has nothing to do with marketing.

Most job title classification systems operate on a simple assumption: titles are universal. They build massive lookup tables, apply some pattern matching, and call it a day.

Here’s what they miss:

The Context Problem

Example 1: The CMO Paradox

  • Technology Industry: Chief Marketing Officer, runs demand gen, owns the marketing department
  • Healthcare Industry: Chief Medical Officer, manages physician staff, runs clinical operations
  • Manufacturing Industry: Chief Marketing Officer, traditional marketing role

Traditional systems default to marketing every time. That’s a massive data quality issue for anyone selling into healthcare.

Example 2: The Geography Gap

  • “Geschäftsführer” (Germany): Managing Director or CEO equivalent
  • “Directeur Général” (France): General Manager or CEO depending on context
  • “总经理” (China): General Manager, often at VP or C-level

Most systems either ignore non-English titles entirely or run them through Google Translate and hope for the best.

The Industry Blind Spot

A “Principal” at a tech company is likely a senior engineer. At a school? That’s the head administrator. At an investment firm? Senior investor. Yet most classification systems pick one interpretation and run with it globally.

Our Approach: Context-Aware, Multi-Dimensional Classification

Instead of building bigger lookup tables, we built a system that actually reasons about job titles. Here’s the actual prompt architecture powering our classification:

Input Context:
- Job title
- LinkedIn job description  
- LinkedIn industry (weighted heavily)
- Country of employment

Output Dimensions:
1. JOB_FUNCTION (21 categories)
2. JOB_LEVEL (6 tiers)
3. PERSONA (25 buyer archetypes)

The key innovation? We don’t just classify. We triangulate. By considering industry context and geographic norms, we can correctly identify what a role actually does versus what the title might suggest.
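To make the shape of the problem concrete, the input context and output dimensions listed above can be sketched as two small record types. This is only an illustration: the class and field names are hypothetical and not RevenueBase's actual schema, and the example values are taken from the CDO healthcare case discussed later in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TitleInput:
    """The four context signals (hypothetical field names)."""
    job_title: str
    linkedin_industry: str
    country: str
    job_description: Optional[str] = None  # not always available


@dataclass(frozen=True)
class TitleClassification:
    """One value per output dimension (hypothetical field names)."""
    job_function: str  # one of the 21 categories
    job_level: str     # one of the 6 tiers
    persona: str       # one of the 25 buyer archetypes


example_input = TitleInput(
    job_title="CDO",
    linkedin_industry="Hospital & Health Care",
    country="United States",
)

# The result this post describes for a healthcare CDO (Chief Dental Officer).
example_output = TitleClassification(
    job_function="Health and Human Services",
    job_level="C-Team",
    persona="Other",
)
```

Keeping the job description optional reflects the fact that it is only used when available, while title, industry, and country are always part of the input.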

Real-World Classification Examples

Let’s look at how this plays out with actual titles:

The Multilingual Challenge

Title: “Chef de Projet Digital” (France)

  • Old System: Often miscategorized as “Chef” (culinary) or just ignored
  • Our System: Digital Project Manager. Operations. Manager level. Product Manager persona.

Title: “リードエンジニア” (Japan, Lead Engineer)

  • Old System: “Other” or missed entirely
  • Our System: Engineering. Manager/Director level. Software Engineer persona.

The Ambiguous Acronym

Title: “CDO”

  • Retail Industry: Chief Digital Officer. IT department. C-Team. CIO/CTO Executive persona.
  • Healthcare Industry: Chief Dental Officer. Health and Human Services department. C-Team. Other persona.
  • Data/Analytics Industry: Chief Data Officer. IT department. C-Team. Data & Analytics persona.

Our system makes these distinctions automatically by weighing the LinkedIn industry context. No manual rules needed.
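One way to hold any classifier to this standard is to write the CDO cases down as a test fixture. The sketch below is not the classifier itself (which uses no manual rules); it is a hypothetical check, with made-up names such as `CDO_CASES` and `check_classifier`, that reports which industries a given classifier gets wrong.

```python
from typing import Callable, List, NamedTuple


class Expected(NamedTuple):
    expansion: str
    department: str
    level: str
    persona: str


# Expected results for the title "CDO", keyed by industry, copied from the examples above.
CDO_CASES = {
    "Retail": Expected("Chief Digital Officer", "IT", "C-Team", "CIO/CTO Executive"),
    "Healthcare": Expected("Chief Dental Officer", "Health and Human Services", "C-Team", "Other"),
    "Data/Analytics": Expected("Chief Data Officer", "IT", "C-Team", "Data & Analytics"),
}

Classifier = Callable[[str, str], Expected]


def check_classifier(classify: Classifier) -> List[str]:
    """Return the industries where classify("CDO", industry) disagrees with the fixture."""
    return [
        industry
        for industry, expected in CDO_CASES.items()
        if classify("CDO", industry) != expected
    ]


def always_digital(title: str, industry: str) -> Expected:
    """A context-blind baseline that picks one reading and applies it everywhere."""
    return CDO_CASES["Retail"]


failures = check_classifier(always_digital)
```

The context-blind baseline is right for retail and wrong for the other two industries, which is exactly the failure described above.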

The Geographic Variation

Title: “Managing Director”

  • United States + Finance Industry: Senior leadership, often heads a division
  • United Kingdom + Any Industry: Often equivalent to CEO
  • Germany + Manufacturing: CEO equivalent (the usual English rendering of “Geschäftsführer”)

The combination of country and industry tells us what the title actually means in context.

What Makes This Different

Traditional classification approaches fail in predictable ways:

Keyword Matching: Sees “Chief” and assumes C-suite. Misses that “Chief of Staff” isn’t an executive role.

Lookup Tables: Work great for common English titles. Fall apart with “Leiter Vertrieb” (German for Head of Sales) or emerging titles like “Head of Revenue Operations.”

Rule-Based Systems: Hundreds of if/then statements that break the moment someone creates a new title. Remember when “Growth Hacker” became a thing?
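The keyword-matching failure is easy to reproduce. Here is a minimal sketch of such a matcher; the rules and level labels are hypothetical stand-ins for the kind of system described above, not anyone's production code.

```python
import re

# Hypothetical keyword rules, checked in order; the first match wins.
KEYWORD_RULES = [
    (re.compile(r"\bchief\b", re.IGNORECASE), "C-Team"),
    (re.compile(r"\b(vp|vice president)\b", re.IGNORECASE), "VP"),
    (re.compile(r"\b(head|director)\b", re.IGNORECASE), "Director"),
]


def naive_level(title: str) -> str:
    """Return the level of the first matching keyword rule, else 'Unknown'."""
    for pattern, level in KEYWORD_RULES:
        if pattern.search(title):
            return level
    return "Unknown"


chief_of_staff = naive_level("Chief of Staff")    # matches "chief", so it is labeled C-Team
leiter_vertrieb = naive_level("Leiter Vertrieb")  # no English keyword, so it falls through
```

The matcher labels “Chief of Staff” as C-suite and has nothing to say about “Leiter Vertrieb”, because it only sees tokens and never the industry or country around them.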

Our approach uses contextual reasoning. Every classification considers:

  • LinkedIn industry context (healthcare vs tech vs finance)
  • Geographic norms (what titles mean in different countries)
  • LinkedIn job descriptions, when available

Why This Matters for GTM Teams

If you’re building on RevenueBase data, this impacts everything:

Better ICPs: You and your users can actually differentiate between technical and business buyers now, even when their titles overlap.

Global Expansion: Launch in Germany without spending months figuring out that “Prokurist” is basically a VP-level role with signing authority.

Reduced Waste: Your SDRs stop calling Chief Medical Officers about marketing automation software.

Smarter Routing: Route based on actual buyer personas, not guesswork about what a title means.

The Technical Implementation

Here’s what actually happens when we classify a title:

  1. Multi-dimensional Input: We pull job title, LinkedIn job description, LinkedIn industry, and country
  2. Industry Weighting: The system heavily weights industry context (CMO + healthcare = medical, not marketing)
  3. Three-axis Output: Every role gets Function, Level, and Persona classifications
  4. Confidence Scoring: If we can’t classify with confidence, we say so (rather than forcing a bad match)

This isn’t revolutionary ML architecture. It’s just doing the obvious thing that nobody else bothered to do: considering context.
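Step 4 is the piece most often skipped, so here is a rough sketch of what “saying so” can look like in code. Everything in it is an assumption made for illustration: the `ScoredLabel` type, the 0 to 1 confidence scale, and the 0.7 threshold are not details of our production system.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.7  # assumed value, chosen only for this example


@dataclass(frozen=True)
class ScoredLabel:
    function: str
    level: str
    persona: str
    confidence: float  # assumed to lie between 0 and 1


def accept_or_abstain(
    candidate: ScoredLabel, threshold: float = CONFIDENCE_THRESHOLD
) -> Optional[ScoredLabel]:
    """Return the candidate if it clears the threshold, otherwise None (unclassified)."""
    if candidate.confidence >= threshold:
        return candidate
    return None


confident = ScoredLabel("Health and Human Services", "C-Team", "Other", confidence=0.93)
unsure = ScoredLabel("IT", "C-Team", "CIO/CTO Executive", confidence=0.41)

kept = accept_or_abstain(confident)
dropped = accept_or_abstain(unsure)
```

Returning `None` instead of the best available guess keeps low-confidence rows visibly unclassified, so downstream routing can treat them differently rather than trusting a forced match.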

The Industry Factor

LinkedIn industry classification becomes incredibly powerful when you use it right. Consider how the same title shifts meaning:

“VP Operations”

  • Hospital & Health Care: Clinical operations, patient flow, medical staff coordination
  • Internet: Site reliability, infrastructure, technical operations
  • Logistics and Supply Chain: Warehouse management, distribution, fleet operations

Without industry context, you’re just guessing. With it, you know exactly who you’re targeting.

Looking Forward

Job titles are getting weirder. “Head of Remote” is a real title now. So is “Chief Meme Officer” (seriously, look it up). The old approaches of pattern matching and lookup tables were always going to hit a wall.

By moving to contextual classification that considers industry and geography, we can handle whatever creative titles companies dream up next. A “Wizard of Light Bulb Moments” at a creative agency? We’ll figure out they’re in creative services. A “Digital Overlord” at a tech startup? Probably IT leadership.

The point isn’t that we’ve solved every edge case. It’s that we’ve built a system that can actually reason about titles instead of just matching patterns.

For GTM teams using RevenueBase data, this means cleaner data, better targeting, and fewer embarrassing mistargeted outreach campaigns.

Because at the end of the day, knowing who you’re actually talking to is kind of important.


Want to explore how our classification system handles your specific use cases? Your RevenueBase account team can walk you through the classification logic for your target industries.
