Identifiers

An identifier is a piece of data that may uniquely or partially identify a real-world entity. The engine accepts the following types:
| Type | Description | Format requirement |
| --- | --- | --- |
| email | Email address | RFC 5321 |
| phone | Phone number | E.164 (e.g. +15551234567) |
| username | Platform-agnostic username | Any |
| ip | IPv4 or IPv6 address | Standard notation |
| domain | Registered domain name | RFC 1035 |
| name | Full or partial name | Any |
| url | Web URL for scraping or analysis | URL |
| address | Physical or mailing address | Any |
| company_name | Business or organization name | Any |
| social_url | Direct URL to a social media profile | URL |

You can submit multiple identifiers in a single request. The engine treats them as a combined signal and attempts to resolve them to a single entity.
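
As a rough illustration of submitting several identifiers together, the sketch below checks each value on the client before building a request body. The `{"type", "value"}` shape, the `identifiers` key, and the `make_identifier` helper are assumptions made for this example; they are not taken from the engine's request schema. Only the phone (E.164) and IP checks are implemented.

```python
import ipaddress
import re

# E.164: a leading "+", then 2 to 15 digits, the first of which is non-zero.
E164_RE = re.compile(r"\+[1-9]\d{1,14}")

# Identifier types listed in the table above.
ALLOWED_TYPES = {
    "email", "phone", "username", "ip", "domain",
    "name", "url", "address", "company_name", "social_url",
}


def make_identifier(id_type: str, value: str) -> dict:
    """Return a {"type", "value"} dict (hypothetical shape) after basic checks."""
    if id_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type}")
    if id_type == "phone" and not E164_RE.fullmatch(value):
        raise ValueError(f"phone is not in E.164 format: {value}")
    if id_type == "ip":
        # Accepts IPv4 and IPv6; raises ValueError on malformed input.
        ipaddress.ip_address(value)
    return {"type": id_type, "value": value}


# Two identifiers submitted together as one combined signal.
payload = {
    "identifiers": [
        make_identifier("email", "jane@example.com"),
        make_identifier("phone", "+15551234567"),
    ]
}
```

Rejecting a malformed phone number or IP address locally avoids spending a query on input the engine would not accept.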

Entity resolution

When multiple identifiers are submitted, the engine attempts to determine whether they all belong to the same real-world person or organization. The result is communicated via resolution.status:
| Status | Meaning |
| --- | --- |
| resolved | All identifiers map to a single entity with high confidence. |
| ambiguous | Identifiers match more than one candidate entity. All candidates are returned ranked by confidence. The caller should review and select the correct match. |
| no_match | No entity could be associated with any of the provided identifiers. |

When status is ambiguous, the response contains multiple entries in the entities array, each with its own confidence score. Do not assume the first entry is correct; review each candidate.
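
One way to act on the three statuses is sketched below. It assumes the parsed response is a dict with `resolution.status` nested under a `resolution` key and a top-level `entities` list whose items carry a `confidence` field; the exact response layout and the `triage` helper are assumptions for illustration.

```python
def triage(response: dict) -> tuple[bool, list[dict]]:
    """Return (needs_manual_review, candidates) for a parsed response dict."""
    status = response["resolution"]["status"]
    entities = response.get("entities", [])

    if status == "no_match":
        return False, []
    if status == "resolved":
        return False, entities
    if status == "ambiguous":
        # Candidates are documented as ranked already; sorting is defensive.
        ranked = sorted(entities, key=lambda e: e["confidence"], reverse=True)
        # Every candidate goes to a reviewer; the top entry is not auto-selected.
        return True, ranked
    raise ValueError(f"unexpected resolution status: {status}")


example = {
    "resolution": {"status": "ambiguous"},
    "entities": [
        {"id": "a", "confidence": 0.62},
        {"id": "b", "confidence": 0.71},
    ],
}
needs_review, candidates = triage(example)
```

Returning the full candidate list with a review flag, rather than a single "best" entity, keeps the selection decision with the caller.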

Confidence scores

Confidence is expressed as a float between 0.0 and 1.0. It appears at three levels:
  • resolution.confidence: how certain the engine is that the returned entities represent the queried identifiers
  • entity.confidence: how certain the engine is that a specific candidate entity matches (only meaningful when status is ambiguous)
  • Per data-point confidence: every phone, address, social profile, and associate entry carries its own confidence score reflecting source quality and corroboration
Confidence scores are produced by the engine’s resolution model and reflect the strength and consistency of evidence across sources. They are not probabilities and should be interpreted qualitatively:
| Range | Interpretation |
| --- | --- |
| 0.90–1.0 | High confidence. Multiple independent sources agree. |
| 0.70–0.89 | Moderate confidence. Some corroboration but gaps remain. |
| 0.50–0.69 | Low confidence. Single source or conflicting signals. |
| < 0.50 | Speculative. Treat with significant caution. |
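
The ranges above can be turned into a small lookup, sketched here. The `confidence_band` name is hypothetical, and the sketch assumes the ranges are half-open intervals, so a score such as 0.895 falls in the moderate band; the table does not say how values between the listed bounds are treated.

```python
def confidence_band(score: float) -> str:
    """Map a confidence score in [0.0, 1.0] to a qualitative band."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence out of range: {score}")
    if score >= 0.90:
        return "high"
    if score >= 0.70:
        return "moderate"
    if score >= 0.50:
        return "low"
    return "speculative"


# Label each per-data-point score before showing it to a reviewer.
labels = [confidence_band(s) for s in (0.95, 0.70, 0.69, 0.10)]
```

Because the scores are not probabilities, the label is meant as a reading aid and not as input to further arithmetic.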

Source coverage and partial results

The engine queries all configured connectors in parallel. If a connector is unreachable (e.g. due to a firewall egress rule), the query still completes with results from the connectors that were available. The source_coverage object in every response discloses which connectors were queried and which were not:
"source_coverage": {
  "available": ["connector_001", "connector_002", "connector_003"],
  "unavailable": ["connector_004"],
  "unavailability_reason": "egress_blocked"
}
This disclosure is also included in PDF report exports. The absence of a connector from the available list means that source was not checked for this query. Use GET /connectors to inspect connector status before running critical queries.
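
A caller that depends on particular connectors can check the `source_coverage` object before trusting a result. The sketch below uses the example object shown above; the `unchecked_sources` helper and the idea of a caller-defined "required" set are assumptions for illustration.

```python
def unchecked_sources(source_coverage: dict, required: set[str]) -> set[str]:
    """Return the required connectors that were not checked for this query."""
    available = set(source_coverage.get("available", []))
    return required - available


coverage = {
    "available": ["connector_001", "connector_002", "connector_003"],
    "unavailable": ["connector_004"],
    "unavailability_reason": "egress_blocked",
}

missing = unchecked_sources(coverage, {"connector_001", "connector_004"})
if missing:
    # The result is partial with respect to the connectors this caller needs.
    print("not checked:", sorted(missing), "reason:", coverage.get("unavailability_reason"))
```

Comparing against the `available` list, rather than only reading `unavailable`, follows the rule stated above: a connector that is absent from `available` was not checked.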

Lookup depth

The depth option controls how broadly the engine queries sources:
| Depth | Behavior |
| --- | --- |
| standard | Queries primary connectors. Covers the majority of use cases with the lowest latency. |
| deep | Queries all available connectors including secondary and derived sources. Higher latency and connector usage. Use when standard coverage is insufficient. |
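
One possible pattern is to start at standard depth and escalate only when the result falls short. In the sketch below, `run_query` stands in for whatever function sends the request with a given depth, and `is_sufficient` is a caller-defined test; both are hypothetical and not part of the engine's API.

```python
from typing import Callable


def lookup_with_escalation(
    run_query: Callable[[str], dict],
    is_sufficient: Callable[[dict], bool],
) -> tuple[str, dict]:
    """Try depth "standard" first; fall back to "deep" only if needed."""
    response = run_query("standard")
    if is_sufficient(response):
        return "standard", response
    # Deep lookups cost more latency and connector usage, so they run second.
    return "deep", run_query("deep")


# Stand-in for a real client call, used here only to exercise the sketch.
def fake_query(depth: str) -> dict:
    status = "resolved" if depth == "deep" else "no_match"
    return {"resolution": {"status": status}}


depth_used, result = lookup_with_escalation(
    fake_query,
    is_sufficient=lambda r: r["resolution"]["status"] != "no_match",
)
```

What counts as "insufficient" is left to the caller here, for example a no_match status or a resolution confidence below a chosen threshold.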