The world of search is shifting from lists of ranked links to conversational answers generated by large language models (LLMs). Instead of using PageRank‑like algorithms, these systems decide which words to output based on probability distributions over tokens. A log probability (or logprob) is simply the natural logarithm of the probability that a particular token will follow in a sequence. By examining these logprobs, marketers can see which brands or entities the model instinctively associates with a query and can take steps to make their own brand part of that shortlist.
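Because a logprob is a natural logarithm, it is always zero or negative, and values closer to zero mean a more likely token. As a minimal sketch, the conversion back to a plain probability looks like this (the tokens and values are hypothetical, chosen only for illustration):

```python
import math


def logprob_to_probability(logprob: float) -> float:
    """Convert a natural-log probability back to a probability between 0 and 1."""
    return math.exp(logprob)


# Hypothetical logprobs for three candidate next tokens (illustrative values only).
example_logprobs = {" Asana": -0.4, " Trello": -2.0, " Task": -9.0}

for token, logprob in example_logprobs.items():
    probability = logprob_to_probability(logprob)
    print(f"{token!r}: logprob={logprob:.1f} -> probability={probability:.4f}")
```

A logprob of 0 corresponds to a probability of 1, and each drop of about 0.69 halves the probability.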
Why Should Brands Care?
When ChatGPT, Gemini or similar models answer a question from what they learned in training, they don’t consult a search index; they predict tokens. If the tokens that make up your brand name have a high logprob in response to a relevant question, your brand is far more likely to appear in the answer. In other words, visibility in AI search is about being a probable word. Logprobs act as a diagnostic: they show what the model already knows about your brand and reveal whether your brand is even in the model’s consideration set.
Traditional SEO practices—keywords, meta tags, internal links—remain important, but AI search also rewards trust and authority. Think of your site’s reputation like a digital credit score: models assess whether the content is authoritative, up‑to‑date and supported by credible sources. Allowing AI crawlers to index your pages, producing consistent expert content and securing authoritative backlinks all help build that score.
Illustrative Example: Project‑Management Tools Query
Suppose you operate “TaskFlow,” a new project‑management platform. A potential customer might ask an AI assistant, “What are the top project management tools for small teams?” To see whether TaskFlow appears in the model’s mental shortlist, you can use an API that returns logprobs for the next tokens.
Below is a Python example (for OpenAI’s Chat Completions API with GPT‑4 or a similar model) that requests log probabilities for the first token of the answer:

    from openai import OpenAI

    client = OpenAI(api_key="YOUR_API_KEY")
    prompt = "Top project management tools for small teams include"

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1,     # only the first generated token is needed
        temperature=0,
        logprobs=True,    # return the logprob of each generated token
        top_logprobs=5,   # plus the five most likely alternatives
    )

    # The first (and only) generated token, together with its top alternatives.
    first_token = response.choices[0].logprobs.content[0]
    print("Chosen token:", first_token.token)
    for cand in first_token.top_logprobs:
        print(repr(cand.token), cand.logprob)

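If the output shows tokens like " Asana", " Trello" or " Monday" with high logprobs (values close to zero) and no mention of TaskFlow, that means the model doesn’t strongly associate your brand with this context. Keep in mind that a brand name may be split into several tokens, so look for its leading piece rather than the full name. You can then set out to change that perception.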
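The check described above can be sketched as a small helper. This is only an illustration: the function name `find_brand_rank`, the `min_chars` threshold and the sample values are assumptions, and the prefix match is a rough heuristic because brand names are often split into several tokens.

```python
from typing import Optional


def find_brand_rank(
    candidates: list[tuple[str, float]], brand: str, min_chars: int = 3
) -> Optional[int]:
    """Return the 1-based rank of the first candidate token that could start the brand name.

    `candidates` holds (token, logprob) pairs. Returns None when no token matches.
    """
    # Highest logprob (closest to zero) first.
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    for rank, (token, _logprob) in enumerate(ranked, start=1):
        piece = token.strip().lower()
        # Ignore very short pieces to limit false positives (hypothetical threshold).
        if len(piece) >= min_chars and brand.lower().startswith(piece):
            return rank
    return None


# Hypothetical top candidates for the first answer token (illustrative values only).
sample = [(" Asana", -0.4), (" Trello", -2.0), (" Task", -6.5)]
print(find_brand_rank(sample, "TaskFlow"))   # rank of the token that could start "TaskFlow"
print(find_brand_rank(sample, "Basecamp"))   # None: no candidate starts this name
```

A match on the leading piece only shows that the brand could follow; confirming it would mean requesting more tokens and reading the rest of the name.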
Six Practical Tactics for Entity Alignment
- Map priority queries and evaluate logprob gaps. Identify the questions your target audience asks (e.g., “best project‑management software for startups,” “affordable running shoes for marathons”). Run logprob tests against these prompts and record the top predicted entities. Note whether your brand appears and how confident the model is about each entity.
- Produce expert, value‑driven content. AI systems value depth and authority. Instead of surface‑level posts, publish in‑depth guides, how‑to articles and case studies that link your brand directly to the query topics. Include synonyms and related terms surfaced by your logprob research to broaden semantic coverage.
- Strengthen your trust signals. Models are more likely to mention brands that other credible sources cite. Build relationships with industry publications, participate in research projects and collaborate with respected partners. Aim for mentions in thought‑leader blogs, podcasts or academic papers that discuss the topic.
- Open your site to AI crawlers. Update your robots.txt file to allow bots like GPTBot and Bingbot, and ensure your XML sitemaps list all important pages. If AI crawlers cannot access your content, the model cannot learn about your brand.
- Fact‑check and cite authoritative sources. Provide credible citations for statistics and claims; multiple sources strengthen trust. Avoid relying on secondary aggregator sites when citing data. Updating articles when new evidence emerges signals commitment to accuracy.
- Monitor, iterate and blend with traditional SEO. Re‑run logprob tests regularly to track whether your brand moves into the list of top predicted entities. Don’t fret over small fluctuations: logprob values can vary slightly from run to run and can change when the provider updates the model. At the same time, continue to build backlinks, improve page experience and structure your data; AI search engines still use these signals to assess quality and trustworthiness.
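For the crawler tactic in the list above, one way to confirm that a robots.txt file admits the bots you care about is Python’s standard `urllib.robotparser` module. The sketch below is an assumption-laden example: the robots.txt content, the `crawler_allowed` helper and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that admits GPTBot and Bingbot and hides /admin/ from other bots.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /admin/
"""


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for bot in ("GPTBot", "Bingbot"):
    allowed = crawler_allowed(ROBOTS_TXT, bot, "https://example.com/guides/comparison")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Running the same check against your live robots.txt text before and after an edit shows whether the change had the intended effect.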
Content‑Planning Blueprint
To operationalise entity alignment, create a spreadsheet with the following columns (the logprob values shown are illustrative, not measured):
Audience query | Top predicted entities & logprobs | Brand present? | Actions to align |
---|---|---|---|
Top project management tools for small teams | Asana (−0.4), Trello (−0.8), Monday (−1.0), Basecamp (−1.2) | No | Create a detailed comparison guide that includes TaskFlow; publish guest posts on productivity blogs; update robots.txt to allow AI crawlers. |
Best running shoes for marathons | Nike (−0.3), Adidas (−0.7), Hoka (−0.9), Brooks (−1.1) | No | Commission a review article comparing marathon shoes and highlight our brand; secure citations from sports physiologists. |
Fill in the table for each important query, then design content and outreach initiatives to shift the model’s perception. Over time, high‑quality content and credible mentions can help your brand appear among the top tokens.
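If you would rather generate the spreadsheet than fill it in by hand, a rough sketch is shown here. It assumes the top entities and their logprobs have already been recorded for each query; the `build_audit_row` helper and the sample values are hypothetical, and the column names simply mirror the table above.

```python
import csv
import io

COLUMNS = [
    "Audience query",
    "Top predicted entities & logprobs",
    "Brand present?",
    "Actions to align",
]


def build_audit_row(query, candidates, brand, actions=""):
    """Build one spreadsheet row from (entity, logprob) pairs recorded for a query."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    entities = ", ".join(f"{name.strip()} ({logprob:.1f})" for name, logprob in ranked)
    present = any(name.strip().lower() == brand.lower() for name, _ in ranked)
    return {
        COLUMNS[0]: query,
        COLUMNS[1]: entities,
        COLUMNS[2]: "Yes" if present else "No",
        COLUMNS[3]: actions,
    }


# Hypothetical recorded results (illustrative values only).
row = build_audit_row(
    "Top project management tools for small teams",
    [("Trello", -2.0), ("Asana", -0.4)],
    brand="TaskFlow",
    actions="Create a detailed comparison guide that includes TaskFlow.",
)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```

The resulting CSV text can be saved to a file and opened in any spreadsheet tool, with one row appended per audited query.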
Final Thoughts
Entity alignment is not about gaming the system; it’s about genuinely connecting your brand to the topics your audience cares about. Log probabilities offer a unique tool for auditing how AI models currently perceive your brand and for measuring the progress of your efforts. By pairing logprob insights with authority‑building practices and openness to AI crawlers, you can systematically raise your brand’s “digital credit score” and become a more likely candidate in AI‑generated answers.