Closes #21870: Optimized Prefix and IP Range availability and utilization calculations #22211

Open
pheus wants to merge 1 commit into main from 21870-optimize-prefix-view-alt

Conversation

@pheus
Contributor

@pheus pheus commented May 15, 2026

Fixes: #21870

This PR provides an alternative approach to #21954 for improving prefix detail performance. Instead of deferring expensive rendering work, it reduces the cost of the underlying IP availability and utilization calculations.

The main change is replacing netaddr.IPSet-based counting with SQL-side distinct host counts and Python-side interval merging. This avoids materializing large address sets in memory when rendering prefix and range summaries, especially for large prefixes with many child IPs.

The new implementation:

  • counts distinct IPAddress host values in the database
  • merges populated/utilized IPRange intervals before counting address space
  • avoids double-counting IPs covered by populated or utilized ranges
  • finds the first available IP by streaming sorted occupied intervals
  • updates the prefix detail panel to use available_ip_count
  • adds host-expression indexes for IPRange start/end addresses
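
The interval-merging step listed above can be sketched roughly as follows. This is a minimal illustration that assumes inclusive integer `(start, end)` pairs and treats adjacent intervals as mergeable; the names `merge_intervals` and `covered_size` are hypothetical and are not taken from the PR.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent inclusive (start, end) integer intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_size(intervals):
    """Count the addresses covered by the intervals, each address once."""
    return sum(end - start + 1 for start, end in merge_intervals(intervals))
```

Because sizes are computed from interval bounds, memory use depends on the number of ranges rather than on the number of addresses they contain.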

This also includes a few related correctness improvements:

  • IPRange.get_child_ips() now compares host portions instead of full address/mask values
  • populated-range validation now uses the same host-based comparison
  • rebuild_prefixes() can accept querysets while preserving compatibility with existing callers
  • BaseIPField.get_prep_value() now handles zero addresses such as 0.0.0.0 and ::
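
To illustrate why comparing host portions differs from comparing full address/mask values, here is a small sketch. It uses the standard-library `ipaddress` module as a stand-in for netaddr, and the helper names `same_host` and `host_in_range` are hypothetical.

```python
import ipaddress

# The same host recorded with two different mask lengths.
a = ipaddress.ip_interface("192.0.2.10/24")
b = ipaddress.ip_interface("192.0.2.10/32")


def same_host(x, y):
    """Compare only the host address, ignoring the mask length."""
    return x.ip == y.ip


def host_in_range(iface, start, end):
    """Check range membership using host addresses only."""
    return start.ip <= iface.ip <= end.ip


range_start = ipaddress.ip_interface("192.0.2.1/24")
range_end = ipaddress.ip_interface("192.0.2.50/24")
```

Here `a == b` is false because the masks differ, while `same_host(a, b)` is true, which is the kind of mismatch a host-based comparison avoids.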

Overall, this aims to improve the performance of the hot paths identified in #21870 while preserving the existing address-space semantics.

@pheus

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@pheus pheus force-pushed the 21870-optimize-prefix-view-alt branch from 331d99c to 031aad9 Compare May 15, 2026 17:53
@pheus

This comment was marked as outdated.

@github-actions

This comment was marked as outdated.

@pheus
Contributor Author

pheus commented May 15, 2026

@claude /review Only report on issues of potential concern; we don't need commentary on anything that doesn't require further consideration.

@github-actions
Contributor

github-actions Bot commented May 15, 2026

Claude finished @pheus's task in 4m 50s


PR Review: Optimized Prefix/IPRange Availability & Utilization

  • Gather context
  • Read changed files
  • Analyze for issues of concern
  • Post review findings

Issues of Concern

1. SQL Explosion in count_distinct_ip_hosts_outside_intervals (utils.py:421–446)

The function builds a Django Q() exclusion object by iterating over all merged intervals:

exclusion = Q()
for start, end in intervals:
    exclusion |= Q(
        host_address__gte=netaddr.IPAddress(start, version=version),
        host_address__lte=netaddr.IPAddress(end, version=version),
    )

Since IPRange records can't overlap (enforced by clean()), the number of merged intervals is at most, and typically equal to, the number of mark_populated or mark_utilized ranges touching the prefix. For a prefix with hundreds of such ranges this generates a very long NOT (... OR ... OR ...) SQL predicate. At extreme scale this will hurt rather than help: the original IPSet approach materialized too much data, but this can generate a pathologically large SQL query. A NOT EXISTS subquery against a values list, or a lateral join, would scale better. Worth benchmarking with 500+ ranges before merging.
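
For reference, the quantity being computed can be expressed in plain Python. This is only a sketch of the semantics (distinct hosts that fall outside every merged interval), not the SQL-side implementation in the PR; `count_hosts_outside` is a hypothetical name, and hosts are assumed to be integers.

```python
from bisect import bisect_right


def count_hosts_outside(hosts, merged_intervals):
    """Count distinct hosts not covered by any interval.

    merged_intervals must be sorted, disjoint, inclusive (start, end) pairs.
    """
    starts = [start for start, _ in merged_intervals]
    count = 0
    for host in set(hosts):  # de-duplicate hosts first
        # Find the last interval whose start is <= host, if any.
        i = bisect_right(starts, host) - 1
        covered = i >= 0 and host <= merged_intervals[i][1]
        if not covered:
            count += 1
    return count
```

The binary search keeps the per-host cost logarithmic in the number of intervals, whereas an OR-chain predicate grows linearly with it.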

2. Semantic Drift Risk Between get_available_ips() and available_ip_count (models/ip.py:458–517)

The PR introduces available_ip_count as a faster alternative to len(get_available_ips()) but keeps both implementations. They compute the same logical quantity via different code paths, and nothing enforces their equivalence:

  • get_available_ips() builds an IPSet from ip.address.ip values (which implicitly deduplicates by host).
  • available_ip_count uses count_distinct_ip_hosts_outside_intervals with host annotation.

There is no test that asserts available_ip_count == get_available_ips().size for any case. A future change to one path — e.g. adding a new kind of "reserved" range or changing how pool bounds are handled — could silently diverge. At minimum, consider a shared test helper that compares both results, or at least document this invariant prominently.
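
One way to express such an invariant is a randomized comparison between a brute-force set computation and interval arithmetic. The sketch below is self-contained and does not use NetBox models; all function names are hypothetical, and addresses are modelled as plain integers.

```python
import random


def available_by_set(first, last, hosts, ranges):
    """Brute force: materialize every occupied address in a set."""
    occupied = {h for h in hosts if first <= h <= last}
    for start, end in ranges:
        occupied.update(range(max(start, first), min(end, last) + 1))
    return (last - first + 1) - len(occupied)


def available_by_intervals(first, last, hosts, ranges):
    """Interval arithmetic: clip, merge, then subtract uncovered hosts."""
    clipped = sorted(
        (max(s, first), min(e, last)) for s, e in ranges if s <= last and e >= first
    )
    merged = []
    for s, e in clipped:
        if merged and s <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    range_space = sum(e - s + 1 for s, e in merged)
    loose_hosts = {
        h for h in hosts
        if first <= h <= last and not any(s <= h <= e for s, e in merged)
    }
    return (last - first + 1) - range_space - len(loose_hosts)


def check_equivalence(trials=200, seed=0):
    """Assert both paths agree on randomly generated inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        first = rng.randint(0, 50)
        last = first + rng.randint(0, 60)
        hosts = [rng.randint(0, 120) for _ in range(rng.randint(0, 30))]
        ranges = []
        for _ in range(rng.randint(0, 5)):
            start = rng.randint(0, 120)
            ranges.append((start, start + rng.randint(0, 15)))
        assert available_by_set(first, last, hosts, ranges) == \
            available_by_intervals(first, last, hosts, ranges)
    return trials
```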

3. Template Calls available_ip_count and get_first_available_ip() Separately (prefix_addressing.html:33–64)

{% with available_count=object.available_ip_count %}
  ...
  {% with first_available_ip=object.get_first_available_ip %}

Both calls independently query child IPs and child ranges. available_ip_count issues at least 2 DB queries (child ranges + child IPs); get_first_available_ip issues another 2 (same tables again). For a prefix detail page that also calls get_utilization and get_child_ips().count(), this is 8+ DB round-trips on the same objects. Prefetching or caching the querysets in get_first_available_ip / available_ip_count (e.g., checking self._prefetched_objects_cache) would cut this down, though that's an optimization for a follow-up.

4. get_prep_value Fix Has an Uncovered Edge (fields.py:47)

The fix correctly changes if not value: to if value is None or value == '': to handle 0.0.0.0 / ::. However, raw integer 0 — which is falsy and previously returned None — now falls through to str(self.to_python(0)). In netaddr, IPNetwork(0) resolves to 0.0.0.0/0, which would then fail the /0 mask validator elsewhere. This is an unlikely input path (Django's ORM doesn't pass raw ints to this method), but the behaviour change is undocumented. The added test_fields.py only tests None and ''; a test for value=0 (or a note that it's intentionally not handled) would close the gap.

5. find_first_available_ip Fast Path is Narrow (utils.py:475)

if intervals and intervals[0][0] <= first_int and intervals[0][1] >= last_int:
    return None

After merging, intervals[0] is the lowest-start interval. This fast path correctly fires when a single merged interval covers the entire usable span. But it does not fire when two adjacent merged intervals together cover the full span (e.g., they would have merged), so in practice this is only an early-exit for the single-contiguous-block case. This is fine logically (the subsequent sweep handles all other cases correctly), but the comment "the first merged interval already covers the entire usable span" could mislead a future reader into thinking multi-interval coverage is also caught here.
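
A sweep of this kind, including the single-interval fast path, might look roughly like the sketch below. It assumes integer addresses and sorted, merged, inclusive intervals; `find_first_available` is a hypothetical stand-in and not the PR's function.

```python
def find_first_available(first, last, occupied):
    """Return the lowest free address in [first, last], or None if full.

    occupied must be sorted, merged, inclusive (start, end) integer pairs.
    """
    # Fast path: one merged interval spans the whole usable range.
    if occupied and occupied[0][0] <= first and occupied[0][1] >= last:
        return None

    candidate = first
    for start, end in occupied:
        if end < candidate:
            continue  # interval lies entirely below the candidate
        if start > candidate:
            break  # gap found before this interval
        candidate = end + 1  # candidate is occupied; jump past the interval
        if candidate > last:
            return None
    return candidate if candidate <= last else None
```

The loop stops at the first gap, so it never needs to hold the full set of occupied addresses in memory.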


@pheus pheus force-pushed the 21870-optimize-prefix-view-alt branch from 031aad9 to 80fa30c Compare May 15, 2026 21:24
Replace netaddr.IPSet-based availability and utilization calculations
with SQL-side distinct host counts and Python-side interval merging.
This avoids materializing large address sets in memory when rendering
prefix and range summaries.

The new implementation:
- counts distinct IPAddress host values in the database
- merges populated/utilized IPRanges before counting address space
- avoids double-counting IPs covered by populated/utilized ranges
- finds the first available IP by streaming sorted occupied intervals
- updates the prefix detail panel to use available_ip_count
- adds host-expression indexes for IPRange start/end addresses

Also update IPRange child-IP matching and populated-range validation to
compare host portions instead of full address/mask values, preserve
rebuild_prefixes() compatibility while allowing queryset input, and fix
BaseIPField.get_prep_value() to handle zero addresses such as 0.0.0.0
and ::.
@pheus pheus force-pushed the 21870-optimize-prefix-view-alt branch from 80fa30c to b16dd9e Compare May 15, 2026 21:36
@pheus
Contributor Author

pheus commented May 15, 2026

@claude /review

I agree with points 2, 4, and 5.

For point 2, I’ve added equivalence coverage between available_ip_count and get_available_ips().size for the important availability cases. That should help guard against future semantic drift while still keeping the optimized path separate from the existing IPSet implementation.

For BaseIPField.get_prep_value(), the raw integer 0 edge case is fair. I added a focused test and preserve the previous behavior for that unsupported/raw input path, while still allowing valid zero IP addresses such as 0.0.0.0 and ::.

For the find_first_available_ip() fast-path comment, I agree the logic is correct but the wording can be clearer. I've adjusted the comment to make it explicit that this only covers the case where one merged occupied interval spans the full usable range.

For point 1, I think the current interval-exclusion approach is still the right trade-off for this PR. The intervals are merged before building the predicate, and the expected hot path is many IPs with comparatively few populated/utilized ranges. A very high number of disjoint ranges could justify a different strategy later, but I’d prefer to keep this implementation readable unless benchmarks show that case is a real bottleneck.

For point 3, the template already avoids calling get_first_available_ip() when available_count is zero. When there is available space, the count and first-available calculations use different query shapes, and I’d rather not introduce a combined/cached summary object in this PR. That can be revisited separately if benchmarks show the remaining duplicate work is still significant.

@github-actions
Contributor

github-actions Bot commented May 15, 2026

Claude finished @pheus's task in 1m 25s


Re-review: Verifying Addressed Concerns

  • Read updated files
  • Verify point 2: equivalence tests
  • Verify point 4: get_prep_value edge case
  • Verify point 5: fast-path comment
  • Assess points 1 and 3

Findings

Point 2 — Equivalence coverage: well done.

The assertAvailableIPCountMatchesIPSet helper is correctly placed and called across 13 new test cases covering the meaningful scenarios: distinct-host deduplication, populated/utilized ranges, IPv4 pool vs. non-pool, IPv6 with subnet-router anycast exclusion, overlapping ranges, VRF scoping, a fully-populated prefix, and a container prefix. That's thorough. Semantic drift between the two code paths in these scenarios should now be caught by the test suite.

Point 4 — get_prep_value fix: correct.

if value is None or value == '' or (type(value) is int and value == 0):

Using type(value) is int (exact type, not isinstance) is the right call here: it intentionally excludes booleans and other int subclasses, preserving legacy behaviour for raw-int 0 only. The test cases in test_fields.py cover None, '', IPAddress('0.0.0.0'), IPAddress('::'), and raw 0, closing the gap.

Point 5 — Fast-path comment: addressed.

"one merged occupied interval covers the entire usable span" is unambiguous. No confusion for future readers.

Point 1 — SQL explosion: accepted trade-off.

The reasoning is sound for the expected workload. Merging intervals before building the predicate is the right mitigation, and the scenario where it degrades (hundreds of disjoint non-overlapping ranges) is unlikely in practice. Acceptable to revisit if benchmarks show otherwise.

Point 3 — Duplicate queries in template: accepted trade-off.

The existing short-circuit when available_count == 0 eliminates the common case where get_first_available_ip() would otherwise run unnecessarily. Deferring a combined cache object to a follow-up is the right call given the scope of this PR.


No remaining issues of concern. The three addressed points are handled correctly and the test coverage added is solid.

@pheus pheus marked this pull request as ready for review May 15, 2026 21:48
@pheus pheus requested a review from jeremystretch May 15, 2026 21:48
Member

@jeremystretch jeremystretch left a comment

I've done an initial review, but I'm finding it difficult to keep straight the hierarchy of all the new utility functions. It seems like a lot of them could be moved to model or queryset methods to keep things better organized. What do you think?

Comment thread netbox/ipam/fields.py
# Use an explicit None / empty-string check; `not value` incorrectly treats
# the valid zero addresses 0.0.0.0 and :: as empty. Raw int 0 is preserved
# as "empty" for backward compatibility (Django's ORM does not pass it here).
if value is None or value == '' or (type(value) is int and value == 0):
Member

This seems equivalent but simpler:

Suggested change
- if value is None or value == '' or (type(value) is int and value == 0):
+ if value in (None, '', 0):
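
Whether the two conditions are interchangeable depends on which inputs can reach the field. The stdlib-only sketch below compares them on a few sample values; the function names are hypothetical, and how netaddr objects compare against 0 is deliberately not covered here.

```python
def is_empty_strict(value):
    """Original condition: exact-type check for the integer 0."""
    return value is None or value == '' or (type(value) is int and value == 0)


def is_empty_membership(value):
    """Suggested condition: tuple membership, which relies on == comparison."""
    return value in (None, '', 0)


samples = [None, '', 0, False, 0.0, '0.0.0.0']

# Values on which the two conditions disagree.
differences = [v for v in samples if is_empty_strict(v) != is_empty_membership(v)]
```

With these samples the two conditions agree on None, '', 0 and the string '0.0.0.0', and differ on False and 0.0, because both of those compare equal to 0 under ==.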

Comment thread netbox/ipam/models/ip.py

return available_ips

@property
Member

Given that accessing this property triggers potentially expensive database queries (get_child_ranges(), get_child_ips()), we should probably implement it as a regular method, e.g. get_available_ip_count(). (Alternatively, we could use @cached_property but then we need to worry about stale value.)
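
The staleness trade-off can be shown with a small stand-in class. This is a sketch only: `PrefixLike` and its attributes are hypothetical and simulate a database query with a counter; it is not NetBox code.

```python
from functools import cached_property


class PrefixLike:
    """Toy object that counts how often its 'query' runs."""

    def __init__(self, size):
        self.size = size
        self.used = set()
        self.queries = 0

    def _count_used(self):
        self.queries += 1  # stands in for a database round-trip
        return len(self.used)

    def get_available_ip_count(self):
        """Regular method: always recomputes, so the cost is visible to callers."""
        return self.size - self._count_used()

    @cached_property
    def available_ip_count_cached(self):
        """Cached: computed once per instance, may become stale."""
        return self.size - self._count_used()


p = PrefixLike(10)
fresh_before = p.get_available_ip_count()   # 10
cached_before = p.available_ip_count_cached  # 10, now stored on the instance
p.used.add(1)
fresh_after = p.get_available_ip_count()    # 9
cached_after = p.available_ip_count_cached   # still 10 until invalidated
del p.available_ip_count_cached              # drop the cached value
cached_refreshed = p.available_ip_count_cached  # 9
```

The cached value only refreshes after an explicit `del`, which is the bookkeeping a `@cached_property` would add.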

Comment thread netbox/ipam/models/ip.py
get_available_ips() remains available for callers that need the actual set.
"""
first_ip, last_ip = get_usable_ip_bounds(self)
usable_size = int(last_ip) - int(first_ip) + 1
Member

Might be worth exposing this as a cached property on Prefix, but we also want to be careful to avoid confusion with self.prefix.size.

Comment thread netbox/ipam/utils.py
return sum(end - start + 1 for start, end in intervals)


def count_distinct_ip_hosts_outside_intervals(queryset, intervals, version):
Member

Can this be collapsed into count_distinct_ip_hosts() above by making intervals and version optional?

Comment thread netbox/ipam/utils.py
return netaddr.IPNetwork(text).ip


def annotate_host_address(queryset):
Member

It might make sense to introduce a custom subclass of QuerySet for the IPAddress model to hold some of these functions, so that they can be executed on a queryset directly.

Comment thread netbox/ipam/utils.py
)


def filter_ip_hosts_between(queryset, first_ip, last_ip):
Member

Did you happen to explore implementing this as a lookup? It would be nice to do e.g. .filter(address__host_between=[x, y]) if feasible.

Comment thread netbox/ipam/utils.py
Comment on lines +98 to +100
# Infer the Prefix model from the queryset/list so this helper does not need
# a model import at module scope.
prefix_model = getattr(prefix_list, 'model', None) or prefix_list[0].__class__
Member

Could we use apps.get_model() here instead?

Comment thread netbox/ipam/models/ip.py

return max(self.size - count_distinct_ip_hosts(self.get_child_ips()), 0)

@property
Member

We probably want to keep this as a method, both to avoid a breaking change and because it effects a database query.

Comment thread netbox/ipam/models/ip.py
Comment on lines +795 to +796
@property
def available_ip_count(self):
Member

Same concern here as with available_ip_count() on IPRange. Just calling out for consistency.

Development

Successfully merging this pull request may close these issues.

Performance issues in prefix views with large-scale data

2 participants