Two datasets. One decomposition. We asked OpenAI's gpt-4o-mini to recommend the top real estate brands in 20 Australian suburbs with web search switched off: 800 model calls, ten iterations per suburb, 5 cities, no internet. We then ran the same suburb set through four web-grounded engines (Gemini, Grok, OpenAI, Perplexity) on 21 April 2026. The two layers tell different stories. Ray White is the only brand that holds both, entrenched in every city we tested. LJ Hooker is the cleanest case of the opposite pattern: the model "remembers" the brand in 4 of 5 cities, but the search layer cannot carry it through to citation. McGrath in Sydney scored 0.86 on memory and 0.29 on retrieval. The loudest parametric signal in our entire dataset, and not enough to win the citation race.
This article is not a ranking of "best" Australian real estate brands. It is a decomposition of what AI surfaces, split into two layers a brand can act on independently. Belle Property, McGrath, Jellis Craig, Barry Plant, Place, Harcourts, Harris, Acton, Professionals: each one sits at a different combination of memory and retrieval. The shape of that combination tells you which lever to pull.
Last updated: April 2026
- One brand holds both layers in every city: Ray White.
- LJ Hooker is the clearest cross-city memory-favored case. Present at the memory layer in 4 of 5 cities (Sydney, Perth, Adelaide, Brisbane); below the grounding threshold in all four.
- Brisbane is the most concentrated market in our sample. Five brands have meaningful AI presence; Ray White holds both layers at 0.82 and 0.93.
- Sydney has the thinnest entrenched layer. McGrath, Belle Property, and LJ Hooker all clear memory but not grounding.
- Only two brand-city pairs are retrieval-led: Acton in Perth (0.05/0.31) and Professionals in Adelaide (0.06/0.34). One algorithm change away from disappearing.
- Hockingstuart still surfaces in Melbourne's parametric layer (0.26 bias) despite merging with Belle Property in 2019. The brand persists in the joint network; the parametric memory of the standalone name is louder than current grounding can match.
What we did
Two datasets, two layers, one decomposition.
We tested 20 suburbs across 5 Australian cities (Sydney, Melbourne, Brisbane, Perth, Adelaide), with one prestige, one inner, one middle, and one outer suburb per city. Read every per-city statement in this article as "across the four suburbs we tested in [city]," not "in [city]." That caveat is doing a lot of work.
The parametric probe ran on 26 April 2026. We asked OpenAI's gpt-4o-mini to recommend the top real estate brands in each of the 20 suburbs, with web search disabled. Each suburb was probed 10 times at temperature 0.8, across 2 prompt families. 800 model calls total. Cost: $0.131. Output: every brand the model surfaced from training data alone, with no retrieval allowed. This is not "ChatGPT bias." This is one model's parametric layer. Larger OpenAI models (gpt-4o, o3) may have different priors.
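If you want to replicate the probe, the loop is small. Below is a minimal sketch, assuming the OpenAI Python SDK; the prompt wording, the line-by-line parsing, and the brand matching are illustrative placeholders, not our production pipeline.

```python
# Minimal sketch of a no-search parametric probe (illustrative, not the production pipeline).
# Assumes the OpenAI Python SDK; prompt wording and parsing are placeholder assumptions.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def probe_suburb(suburb: str, iterations: int = 10, temperature: float = 0.8) -> Counter:
    """Ask gpt-4o-mini for top real estate brands in a suburb, no web search, N times."""
    counts = Counter()
    for _ in range(iterations):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",      # no web-search tool attached: parametric memory only
            temperature=temperature,
            messages=[{
                "role": "user",
                "content": f"List the top real estate brands operating in {suburb}, Australia. "
                           "Answer with brand names only, one per line.",
            }],
        )
        lines = resp.choices[0].message.content.splitlines()
        counts.update(line.strip(" -*") for line in lines if line.strip())
    return counts

# Bias score for one suburb = fraction of iterations that named the brand,
# e.g. probe_suburb("Surry Hills")["McGrath"] / 10
```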
The grounding audit ran on 21 April 2026. We sent the same suburb set across four web-grounded AI engines (ChatGPT with web search, Gemini, Grok, Perplexity). 160 prompts; 152 usable responses (Gemini returned 8 errors and 1 refusal across 40 prompts). Output: every brand each engine surfaced when retrieval was enabled, with cited URLs. The grounding-side dataset is the parent of our cross-city Australian study. Individual agents (Mack Hall, Ted Pye, Nick Tang, Aaron Woolard, Ingrid Bradshaw, Nigel Ross) were excluded from this brand decomposition because they are people, not brands; our entity-footprint pilot covered them separately.
For every brand named across either dataset, we computed two scores. Bias score: the fraction of probe iterations naming the brand, from 0.00 to 1.00. Grounding score: the fraction of audit engines (out of 4) naming the brand for that suburb, weighted by whether the mention came with a cited URL. Wilson 95% confidence intervals on both layers. A minimum-suburb-spread filter (the brand must appear in at least 2 of 4 sampled suburbs) prevents single-suburb concentration from inflating cross-city scores.
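In code form, the two scores and the spread filter reduce to a few lines. A minimal sketch follows, assuming equal weighting across the four engines and a 0.5 credit for an uncited mention; that citation weight is an assumption for illustration, not a published constant of the study.

```python
def bias_score(mentions: int, iterations: int = 10) -> float:
    """Fraction of probe iterations that named the brand (0.00 to 1.00)."""
    return mentions / iterations

def grounding_score(engine_mentions: list[bool]) -> float:
    """Fraction of the 4 audit engines naming the brand, weighted by citation.

    engine_mentions holds one entry per engine that named the brand:
    True = named with a cited URL, False = named without one.
    The 0.5 weight for uncited mentions is an illustrative assumption.
    """
    weights = [1.0 if cited else 0.5 for cited in engine_mentions]
    return sum(weights) / 4

def passes_spread_filter(suburbs_present: int, min_suburbs: int = 2) -> bool:
    """Brand must appear in at least 2 of the 4 sampled suburbs in a city."""
    return suburbs_present >= min_suburbs
```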
Two days. Two methods. 80+ brand names. Run your own brand against the same probe before changing your strategy.
- 4 suburbs per city. Each city verdict reflects one prestige, one inner, one middle, and one outer suburb. Read every city statement as "across the four suburbs we tested," not "in [city]."
- The parametric layer is gpt-4o-mini only. Larger OpenAI models may have different parametric priors. We have not tested them yet.
- Grounding layer = 4 engines on 21 April 2026. Gemini returned 8 errors and 1 refusal across 40 prompts; its grounding contribution is partial.
- Wilson 95% CIs computed. With n_audit at roughly 30 per city, the CI on a 0.30 grounding score is approximately ±0.15 (see the check after this list). Borderline calls (lower CI bound below threshold) are directional rather than confirmed.
- Individuals filter applied. Named individual agents are excluded; the entity-footprint pilot handles them.
- Min-suburb spread of 2 enforced. A brand must appear in at least 2 of the 4 sampled suburbs to qualify for any non-invisible quadrant.
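To make the ±0.15 figure concrete, here is the Wilson interval check behind the confidence-interval bullet; a self-contained sketch with n_audit set to 30, per the estimate above.

```python
import math

# Wilson 95% CI on a grounding score of 0.30 at n ≈ 30.
p, n, z = 0.30, 30, 1.96
denom = 1 + z**2 / n
centre = (p + z**2 / (2 * n)) / denom
half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
print(f"{centre - half:.2f} to {centre + half:.2f}")  # ~0.17 to 0.48, i.e. about ±0.15 around 0.30
```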
Why parametric memory and grounded retrieval are different levers
Parametric memory and grounded retrieval are two distinct citation pathways, and most "AEO" advice only addresses one of them.
Dan Petrovic at dejan.ai published the Selection Rate and Primary Bias frameworks that this study extends. SR (the rate at which a brand gets cited) is shaped by two layers we measure separately: (model believes you exist) × (retrieval surfaces you). The first is parametric. It is set by training data, RLHF, and brand mentions across the high-trust web. By external estimates it moves on roughly 3-6 month cycles as foundation models retrain and refresh; OpenAI does not publish an exact cadence. The second is the SEO-adjacent retrieval layer. It moves week to week with Google's and Bing's ranking changes.
Different lever entirely.
Parametric memory is downstream of brand presence in places models read during training. Wikipedia, news archives, third-party authority sites, Wikidata, structured industry data. A brand that appears in those places at scale gets baked into the model's prior. When the user asks "top real estate brands in Surry Hills" with no internet, the model surfaces what its training corpus already knows. McGrath Sydney is the limit case. 0.86 bias score across 4/4 sampled suburbs (Mosman, Surry Hills, Epping, Penrith). The highest parametric score in our entire dataset. That number measures gpt-4o-mini's training-baked memory of the McGrath brand, not McGrath's current grounding performance, which sits at 0.29 across the same four-suburb set.
Grounded retrieval is downstream of search-surface visibility on the day of the query. A page that ranks for "real estate agents Surry Hills" on 21 April gets pulled into ChatGPT's web-search context, into Gemini's grounding pipeline, into Perplexity's source set, into Grok's web answer. It also gets cited. Brands that publish suburb-specific content, maintain fresh listing pages, and earn third-party mentions in the same week clear this layer. Brands that rely on parametric reputation alone do not.
Most AEO advice we read addresses only the second half. Schema markup. Structured FAQ. Topical clustering. Crawlability. All of that affects the grounding layer. None of it touches the memory layer. Brand-mention work (press, Wikipedia, third-party authority) is the lever that moves the memory layer, and it moves on training cycles, not on weekly SEO sprints.
Most AEO advice only addresses one half of the equation. The half that moves on training cycles is where the long game lives.
The four-quadrant map
The four-quadrant map sorts every brand into one of four positions based on whether each layer clears the 0.30 threshold. Both clear: entrenched. Memory clears, retrieval doesn't: memory-favored. Retrieval clears, memory doesn't: retrieval-led. Neither clears (or the brand only appears in one of four suburbs): invisible. We picked 0.30 because it sits above the noise floor for both layers in our dataset. An appendix analysis at 0.20 widens the active set without changing the headline brand assignments.
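Stated as code, the assignment rule is a handful of comparisons. This sketch assumes the 0.30 threshold, reads "clears" as at-or-above, and applies the 2-of-4 suburb spread filter described in the methodology; the example suburb count for Ray White Brisbane is illustrative.

```python
def quadrant(bias: float, grounding: float, suburbs_present: int,
             threshold: float = 0.30, min_suburbs: int = 2) -> str:
    """Assign a brand-city pair to one of the four quadrants."""
    if suburbs_present < min_suburbs:
        return "invisible"            # single-suburb concentration does not qualify
    memory = bias >= threshold
    retrieval = grounding >= threshold
    if memory and retrieval:
        return "entrenched"
    if memory:
        return "memory-favored"
    if retrieval:
        return "retrieval-led"
    return "invisible"

# Examples from the tables below:
# quadrant(0.86, 0.29, 4)  -> "memory-favored"   (McGrath, Sydney)
# quadrant(0.05, 0.31, 3)  -> "retrieval-led"    (Acton, Perth)
# quadrant(0.82, 0.93, 4)  -> "entrenched"       (Ray White, Brisbane; suburb count illustrative)
```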
Different position, different roadmap.
The lever you pull to move out of memory-favored is grounding work: SEO, freshness, suburb-page content depth, schema markup. That work shows up in next week's rankings. The lever you pull to move out of retrieval-led is the inverse: brand-mention work, meaning Wikipedia presence, press coverage, third-party authority, structured data on the open web, which shows up in the next training cycle, not next week. Different roadmaps, different cadences, different teams inside an organisation. Treat them as one ("AEO") and you misallocate budget.
| Quadrant | What it means | The lever to pull |
|---|---|---|
| Entrenched | Both layers clear the threshold | Hardest to displace; coexistence is the only short-term play |
| Memory-favored | Parametric memory clears, retrieval doesn't | Grounding work: SEO, fresh content, structured data, suburb-specific pages |
| Retrieval-led | Retrieval clears, parametric memory doesn't | Brand-mention work: Wikipedia, press, third-party authority |
| Invisible | Neither layer clears, or brand appears in only 1 of 4 suburbs | Below detection in this 4-suburb sample |
At the 0.30 threshold our map holds 21 active brand-city pairs: 10 entrenched, 9 memory-favored, 2 retrieval-led. Another 94 pairs are invisible. The shape of that distribution is the shape of the Australian AI brand market we measured. Most brands are below threshold in most cities. A small set of incumbents holds both layers. The interesting middle (memory-favored) is where the strategic work lives.
Per city: which Australian brands sit in which quadrant
Five cities, four suburbs each, one map per city. The per-city patterns differ enough that a single national strategy does not fit.
Sydney: only Ray White holds both layers
Sydney returned 24 brands across the four suburbs we tested. One entrenched (Ray White at 0.74 bias and 0.61 grounding, present in 4 of 4 suburbs). Three memory-favored. Zero retrieval-led. Twenty invisible.
The thinnest entrenched layer of any city in our sample. McGrath sits at 0.86 bias and 0.29 grounding (4/4 suburbs), the loudest parametric signal in our entire dataset, but the search layer cannot carry it through to citation. Belle Property runs 0.66 / 0.26 (4/4). LJ Hooker runs 0.61 / 0.13 (4/4). All three are loud in memory and quiet in grounding. The pattern is unusually consistent: across the four suburbs we tested in Sydney, these three major brands form the cleanest memory-favored cluster we found. The Sydney spoke covers the agent-level layer for the same suburb set.
Across the four suburbs we tested in Sydney: a memory-rich, retrieval-poor market for everyone except Ray White.
Melbourne: three entrenched brands, no memory-favored
Melbourne returned 33 brands across the four suburbs we tested. Three entrenched, zero memory-favored, zero retrieval-led, thirty invisible.
| Brand | Bias | Grounding | Quadrant |
|---|---|---|---|
| Jellis Craig | 0.53 | 0.39 | Entrenched |
| Ray White | 0.59 | 0.32 | Entrenched |
| Barry Plant | 0.50 | 0.39 | Entrenched |
| Marshall White | 0.25 | 0.21 | Below threshold (directional) |
| Nelson Alexander | 0.25 | 0.21 | Below threshold (directional) |
| Hockingstuart | 0.26 | 0.00 | Below threshold (ghost) |
The most diversified entrenched layer in our sample. Three distinct brands hold both layers. Marshall White and Nelson Alexander sit just below threshold (0.25/0.21 for both); we treat these as directional rather than confirmed. The Melbourne spoke drills into the agent-level signals beneath these brands. Hockingstuart is the structurally interesting case. The brand merged with Belle Property in 2019 and now operates within that network, but gpt-4o-mini still surfaces "Hockingstuart" from training data at 0.26 bias across 3 of 4 suburbs we tested. Grounding scores it at 0.00. Memory carries the old standalone name forward longer than the search index does.
Brisbane: the most concentrated AI brand market in our sample
Brisbane returned 18 brands across the four suburbs we tested. Four entrenched, one memory-favored, zero retrieval-led, thirteen invisible. The most concentrated AU market we tested.
| Brand | Bias | Grounding | Quadrant |
|---|---|---|---|
| Ray White | 0.82 | 0.93 | Entrenched |
| McGrath | 0.57 | 0.41 | Entrenched |
| Place | 0.38 | 0.41 | Entrenched |
| Harcourts | 0.45 | 0.31 | Entrenched |
| LJ Hooker | 0.41 | 0.28 | Memory-favored |
Five brands have meaningful AI presence across the four suburbs we tested in Brisbane. Ray White dominates both layers at 0.82 and 0.93, the highest combined score in the entire study. Place is the most distinctive Brisbane-native brand in the entrenched set. LJ Hooker holds the memory layer (0.41) but not grounding (0.28), repeating a pattern we see in 4 of 5 cities.
Perth: where retrieval-led visibility shows up
Perth returned 26 brands across the four suburbs we tested. One entrenched, two memory-favored, one retrieval-led, twenty-two invisible. The only city in our sample with all three non-invisible quadrants populated. The Perth spoke covers the agent-level signals.
| Brand | Bias | Grounding | Quadrant |
|---|---|---|---|
| Ray White | 0.62 | 0.47 | Entrenched |
| LJ Hooker | 0.56 | 0.16 | Memory-favored |
| Harcourts | 0.46 | 0.25 | Memory-favored |
| Acton | 0.05 | 0.31 | Retrieval-led |
Acton (0.05/0.31, present in 3 of 4 suburbs) is the cleanest retrieval-led case in the dataset. A brand the model does not "know" but the search layer surfaces. The grounding-side win is real on the day of the query, and exposed the day Google or the OpenAI web index changes how it ranks suburb pages. Diversifying into the memory layer (Wikipedia, press, third-party authority) is the only way to harden that position.
Adelaide: a memory-heavy market with one retrieval-led winner
Adelaide returned 14 brands across the four suburbs we tested. The smallest brand pool of any city in our sample. One entrenched, three memory-favored, one retrieval-led, nine invisible.
| Brand | Bias | Grounding | Quadrant |
|---|---|---|---|
| Ray White | 0.78 | 0.56 | Entrenched |
| LJ Hooker | 0.54 | 0.09 | Memory-favored |
| Harcourts | 0.45 | 0.16 | Memory-favored |
| Harris | 0.31 | 0.00 | Memory-favored |
| Professionals | 0.06 | 0.34 | Retrieval-led |
Three of fourteen brands sit in the memory-favored quadrant. The highest fraction of any city in our sample. Harris is the local Adelaide brand in the memory set; gpt-4o-mini has it at 0.31 bias but the grounding layer scores it at 0.00 across the four suburbs we tested. Professionals is the retrieval-led mirror image of Harris: invisible in the memory layer (0.06), present in the grounding layer (0.34). Different lever, different roadmap.
Cross-city: what the data agrees on
Across all five cities we tested, three patterns hold.
One: Ray White is universal. The brand clears both layers in every single city in our sample. Not "leading" in every city. Not "the highest scorer" in every city. Entrenched, by our definition, in all five.
Want to see where your brand sits? Run the same 4-engine probe on your own suburb →
Two: LJ Hooker repeats the same memory-favored shape in 4 of our 5 cities. Brisbane is the closest to crossing into entrenched (0.41 bias, 0.28 grounding). Perth holds 0.56 bias against 0.16 grounding. Adelaide reads 0.54 against 0.09. Sydney shows 0.61 against 0.13. Memory clears the 0.30 threshold in four cities. Grounding clears in none. We read this as structural rather than coincidental: parametric footprint built across decades of network presence, retrieval surface that does not match it at the suburb level.
Three: retrieval-led brands are rare. Only two brand-city pairs in the entire dataset cleared the grounding threshold without parametric support. Acton in Perth. Professionals in Adelaide. One algorithm change away from invisibility, because the grounding-only position has no memory cushion underneath it.
What this means for Australian real estate brands
Three takeaways come out of this map sharp enough to act on this quarter.
1. Identify which layer is failing for your brand-city pair. Memory-favored means the model knows you but the citation race goes elsewhere; the lever is grounding work (SEO, freshness, structured data, suburb-specific page depth). Retrieval-led means the search layer surfaces you but the model does not "remember" you between queries; the lever is brand-mention work (Wikipedia, press coverage, third-party authority that future training runs ingest). Acton in Perth (0.05/0.31) and Professionals in Adelaide (0.06/0.34) are the only retrieval-led brand-city pairs in our 21 active set: one ranking change at Google or in the OpenAI web index drops either below threshold. Different lever, different roadmap, different team. Treating both as one programme misallocates budget.
2. Memory-favored is fixable, but slowly. Parametric memory updates roughly every 3-6 months as foundation models retrain (industry estimate, not OpenAI policy). Brand-mention work commissioned today shows up in citations one or two training cycles later. The investment is a year-long bet. McGrath Sydney's 0.86 bias score is a decade of McGrath presence in the high-trust web (press, industry directories, Wikipedia mentions). That score is durable, even when the grounding layer disappoints at 0.29.
3. Probe your own brand-city pairs before you change anything. The free instant check at /check tests the grounding layer across 4 engines. The parametric layer requires a 10-iteration no-search probe per suburb; 800 model calls covers a 5-city map at $0.13 in gpt-4o-mini API cost. The methodology sits in the "What we did" section above; we run probes for clients on request.
What we don't know yet
Four suburbs per city is a sample, not a population. A larger sample (eight or twelve suburbs per city) would tighten the Wilson confidence intervals and might surface brands that sit just below the spread filter at four. The borderline cases (Marshall White, Nelson Alexander, Hockingstuart) would either move into the active set or stay below; we cannot say which from this dataset alone.
The parametric layer is gpt-4o-mini only. We have not run the no-search probe against gpt-4o, o3, Claude, or the Gemini family. The 800 model calls in this study are gpt-4o-mini calls. McGrath Sydney's 0.86 bias score is a measurement of gpt-4o-mini's parametric priors for the four Sydney suburbs we tested (Mosman, Surry Hills, Epping, Penrith). Larger OpenAI reasoning models may rank McGrath higher, lower, or differently. We will know once we run them.
The grounding side is one day of data: 21 April 2026. The same audit run in July or October will move with whatever the search engines rank that month. Quadrant membership rotates. The memory-favored versus retrieval-led split itself looks durable across that rotation, because the underlying mechanisms (training cadence vs SEO cadence) are different timescales. The exact brand-city pairs in each quadrant will not stay fixed.
The framework held for the 21 active brand-city pairs in this AU real estate sample. Whether the bias × grounding decomposition transfers cleanly to dental, legal, or hospitality niches in AU (or to AU real estate outside this 5-city sample) is untested. We have a transfer study queued.
FAQ
What is the difference between an AI's parametric memory and grounded retrieval for real estate brands?
Parametric memory is what the model "knows" from training data: brand mentions in Wikipedia, news archives, and the high-trust web baked into model weights. By industry estimates it updates on roughly 3-6 month cycles as models retrain. Grounded retrieval is what the model finds when web search is enabled at query time: pages that rank for the suburb on the day. It moves week to week with Google's and Bing's ranking changes. A brand can clear one layer and not the other, which is why the four-quadrant map exists. McGrath Sydney is the textbook split: 0.86 bias, 0.29 grounding.
Which Australian real estate brand does AI recommend most often?
Ray White. Entrenched in 5 of 5 cities we tested. Brisbane scores were 0.82 bias and 0.93 grounding, the highest combined in our entire dataset.
Why does ChatGPT name LJ Hooker even when its current ranking signal is weak?
LJ Hooker is the cleanest cross-city memory-favored brand in our dataset. Present at the parametric layer in Sydney (0.61 bias), Perth (0.56), Adelaide (0.54), and Brisbane (0.41). Below the grounding threshold in all four. The pattern fits a brand with a large historical footprint in places models read during training (press archives, industry directories, Wikipedia mentions) where the parametric prior has been reinforced over many years. The search-surface layer has not kept pace. So gpt-4o-mini reaches for the brand from memory whenever the user asks "top real estate brands in [Sydney suburb]" with no internet, but the grounded engines (Gemini, Grok, OpenAI with web search, Perplexity) do not surface enough current LJ Hooker pages to clear the citation threshold. Two layers, two trajectories. We see the result as memory-favored.
How many Australian real estate brands are entrenched in AI?
Ten brand-city pairs at the 0.30 threshold across our 5-city sample. Ray White holds five of those slots (one per city). The other five: McGrath in Brisbane (0.57/0.41), Jellis Craig in Melbourne (0.53/0.39), Barry Plant in Melbourne (0.50/0.39), Place in Brisbane (0.38/0.41), and Harcourts in Brisbane (0.45/0.31).
Did we test ChatGPT's larger model?
No. The parametric probe was gpt-4o-mini only.
Can a real estate brand move from memory-favored to entrenched?
Yes, but the work is slow. The parametric layer updates roughly every 3-6 months as models retrain (industry estimate), so brand-mention work today shows up in citations one or two training cycles later. The grounding layer is the faster lever: SEO, fresh content, structured data, suburb-specific page depth. Memory-favored brands already have the memory layer; their work is grounding. The reverse path (retrieval-led to entrenched) is harder because building parametric presence requires sustained third-party authority over years.
Sources
- Cited Research, Bias-vs-Grounding Map of Australian Real Estate, 26 April 2026. Probe: 800 OpenAI gpt-4o-mini calls without web search across 20 AU suburbs, 10 iterations per suburb. Audit: 160 prompts across 4 engines (Gemini, Grok, OpenAI, Perplexity) on 21 April 2026; 152 usable responses after Gemini errors and refusals.
- Cited Research, Australian Real Estate Agent AI Study, 21 April 2026. Parent dataset for the grounding layer. 4 engines × 20 suburbs × 2 prompt families = 160 responses; 152 usable.
- Petrovic, D. Selection Rate and Primary Bias frameworks. dejan.ai research, 2026. The two-layer decomposition presented in this article extends Petrovic's published work to AU real estate brands.
- Cited Research, Entity-Footprint Pilot Audit, 21 April 2026. Sister study covering individual agent signals (review depth, suburb specialisation).
- Australian Bureau of Statistics, suburb classifications used for tier assignment (prestige / inner / middle / outer).
We built Cited because no one was measuring what AI engines actually recommend. Our methodology is public, our data is first-party, and we practise on ourselves before we advise clients.
More about Cited →