Closes #21870: Optimized Prefix and IP Range availability and utilization calculations by pheus · Pull Request #22211 · netbox-community/netbox

pheus · 2026-05-15T16:58:23Z

Fixes: #21870

This PR provides an alternative approach to #21954 for improving prefix detail performance. Instead of deferring expensive rendering work, it reduces the cost of the underlying IP availability and utilization calculations.

The main change is replacing netaddr.IPSet-based counting with SQL-side distinct host counts and Python-side interval merging. This avoids materializing large address sets in memory when rendering prefix and range summaries, especially for large prefixes with many child IPs.

The new implementation:

counts distinct IPAddress host values in the database
merges populated/utilized IPRange intervals before counting address space
avoids double-counting IPs covered by populated or utilized ranges
finds the first available IP by streaming sorted occupied intervals
updates the prefix detail panel to use available_ip_count
adds host-expression indexes for IPRange start/end addresses
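
The interval-merging step in the list above can be sketched in plain Python. This is a minimal illustration, not the PR's code: the function names are hypothetical, intervals are inclusive integer `(start, end)` pairs, and adjacent intervals are assumed to merge.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent inclusive (start, end) integer intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(pair) for pair in merged]


def count_addresses(merged_intervals):
    """Count addresses covered by already-merged inclusive intervals."""
    return sum(end - start + 1 for start, end in merged_intervals)


ranges = [(10, 20), (15, 30), (31, 40), (100, 100)]
merged = merge_intervals(ranges)
print(merged)                   # [(10, 40), (100, 100)]
print(count_addresses(merged))  # 32
```

Counting over merged intervals is pure arithmetic, so the cost depends on the number of ranges rather than the number of addresses they contain.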

This also includes a few related correctness improvements:

IPRange.get_child_ips() now compares host portions instead of full address/mask values
populated-range validation now uses the same host-based comparison
rebuild_prefixes() can accept querysets while preserving compatibility with existing callers
BaseIPField.get_prep_value() now handles zero addresses such as 0.0.0.0 and ::

Overall, this aims to improve the performance of the hot paths identified in #21870 while preserving the existing address-space semantics.

pheus · 2026-05-15T17:59:11Z

@claude /review Only report on issues of potential concern; we don't need commentary on anything that doesn't require further consideration.

github-actions · 2026-05-15T17:59:28Z

Claude finished @pheus's task in 4m 50s

PR Review: Optimized Prefix/IPRange Availability & Utilization

Gather context
Read changed files
Analyze for issues of concern
Post review findings

Issues of Concern

1. SQL Explosion in `count_distinct_ip_hosts_outside_intervals` (`utils.py:421–446`)

The function builds a Django Q() exclusion object by iterating over all merged intervals:

exclusion = Q()
for start, end in intervals:
    exclusion |= Q(
        host_address__gte=netaddr.IPAddress(start, version=version),
        host_address__lte=netaddr.IPAddress(end, version=version),
    )

Since IPRange records can't overlap (enforced by clean()), merging removes few intervals, so the number of merged intervals can approach the number of IP ranges touching the prefix. For a prefix with hundreds of distinct mark_populated or mark_utilized ranges, this generates a very long NOT (... OR ... OR ...) SQL predicate. At extreme scale this will hurt rather than help: the original IPSet approach materialized too much data, but this one can generate a pathologically large SQL query. A NOT EXISTS subquery against a values list, or a lateral join, would scale better. Worth benchmarking with 500+ ranges before merging this PR.
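
To make the semantics of the exclusion concrete, here is a plain-Python sketch of the quantity the predicate computes: distinct hosts that fall outside every merged interval. It is not the PR's SQL-side code and says nothing about database performance; the function name and the integer representation of hosts are assumptions for illustration.

```python
from bisect import bisect_right


def count_hosts_outside_intervals(hosts, merged_intervals):
    """Count distinct integer hosts not covered by any merged inclusive interval.

    merged_intervals must be sorted and non-overlapping.
    """
    starts = [start for start, _ in merged_intervals]
    outside = 0
    for host in set(hosts):
        # Index of the last interval whose start is <= host, if any.
        idx = bisect_right(starts, host) - 1
        if idx < 0 or host > merged_intervals[idx][1]:
            outside += 1
    return outside


hosts = [5, 5, 12, 25, 41, 100]
intervals = [(10, 40), (100, 100)]
print(count_hosts_outside_intervals(hosts, intervals))  # 2
```

Because the intervals are sorted and disjoint, each host needs only one binary search instead of a comparison against every interval.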

2. Semantic Drift Risk Between `get_available_ips()` and `available_ip_count` (`models/ip.py:458–517`)

The PR introduces available_ip_count as a faster alternative to len(get_available_ips()) but keeps both implementations. They compute the same logical quantity via different code paths, and nothing enforces their equivalence:

get_available_ips() builds an IPSet from ip.address.ip values (which implicitly deduplicates by host).
available_ip_count uses count_distinct_ip_hosts_outside_intervals with host annotation.

There is no test that asserts available_ip_count == get_available_ips().size for any case. A future change to one path (e.g. adding a new kind of "reserved" range or changing how pool bounds are handled) could silently make the two diverge. Consider a shared test helper that compares both results, or at minimum document this invariant prominently.
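
The kind of equivalence check meant here can be sketched without Django or netaddr. Both functions below are hypothetical stand-ins: one materializes the address set (as the IPSet path does), the other counts arithmetically over merged intervals. Addresses are plain integers, and the usable bounds are passed in directly.

```python
def available_set(first, last, hosts, occupied_ranges):
    """Set-based path: materialize every usable address, then subtract."""
    usable = set(range(first, last + 1))
    usable -= set(hosts)
    for start, end in occupied_ranges:
        usable -= set(range(start, end + 1))
    return usable


def available_count(first, last, hosts, occupied_ranges):
    """Count-based path: arithmetic over clipped, merged intervals."""
    clipped = sorted(
        (max(start, first), min(end, last))
        for start, end in occupied_ranges
        if end >= first and start <= last
    )
    merged = []
    for start, end in clipped:
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    covered = sum(end - start + 1 for start, end in merged)
    # Hosts inside the usable span that no occupied range already accounts for.
    loose_hosts = {
        h for h in hosts
        if first <= h <= last and not any(s <= h <= e for s, e in merged)
    }
    return (last - first + 1) - covered - len(loose_hosts)


cases = [
    (1, 254, [1, 1, 2, 50], [(40, 60), (55, 80)]),  # duplicates, overlap
    (1, 254, [], [(0, 300)]),                       # fully covered
    (1, 254, [0, 255, 10], []),                     # hosts outside the span
]
for first, last, hosts, ranges in cases:
    assert available_count(first, last, hosts, ranges) == len(
        available_set(first, last, hosts, ranges)
    )
```

A shared helper of this shape keeps the slow path as the reference and fails as soon as the fast path disagrees with it.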

3. Template Calls `available_ip_count` and `get_first_available_ip()` Separately (`prefix_addressing.html:33–64`)

{% with available_count=object.available_ip_count %}
  ...
  {% with first_available_ip=object.get_first_available_ip %}

Both calls independently query child IPs and child ranges. available_ip_count issues at least 2 DB queries (child ranges + child IPs); get_first_available_ip issues another 2 against the same tables. For a prefix detail page that also calls get_utilization and get_child_ips().count(), this adds up to 8+ DB round-trips on the same objects. Prefetching or caching the querysets in get_first_available_ip / available_ip_count (e.g., checking self._prefetched_objects_cache) would cut this down, though that's an optimization for a follow-up.

4. `get_prep_value` Fix Has an Uncovered Edge (`fields.py:47`)

The fix correctly changes if not value: to if value is None or value == '': to handle 0.0.0.0 / ::. However, raw integer 0, which is falsy and previously returned None, now falls through to str(self.to_python(0)). In netaddr, IPNetwork(0) resolves to 0.0.0.0/0, which would then fail the /0 mask validator elsewhere. This is an unlikely input path (Django's ORM doesn't pass raw ints to this method), but the behaviour change is undocumented. The added test_fields.py only tests None and ''; a test for value=0 (or a note that it's intentionally not handled) would close the gap.

5. `find_first_available_ip` Fast Path is Narrow (`utils.py:475`)

if intervals and intervals[0][0] <= first_int and intervals[0][1] >= last_int:
    return None

After merging, intervals[0] is the lowest-start interval. This fast path correctly fires when a single merged interval covers the entire usable span. It never needs to handle several intervals that jointly cover the span, because adjacent or overlapping intervals would already have been merged into one; in practice it is an early exit for the single-contiguous-block case only. This is fine logically (the subsequent sweep handles all other cases correctly), but the comment "the first merged interval already covers the entire usable span" could mislead a future reader into thinking multi-interval coverage is also caught here.
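
The fast path and the sweep that follows it can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the PR's find_first_available_ip: addresses are integers, occupied single IPs are assumed to be represented as one-address intervals, and the input is already sorted and merged.

```python
def find_first_available(first, last, merged_intervals):
    """Return the lowest free integer in [first, last], or None if full.

    merged_intervals must be sorted, non-overlapping, inclusive (start, end) pairs.
    """
    # Fast path: one merged occupied interval covers the entire usable span.
    if merged_intervals and merged_intervals[0][0] <= first and merged_intervals[0][1] >= last:
        return None

    candidate = first
    for start, end in merged_intervals:
        if end < candidate:
            continue          # interval lies entirely below the candidate
        if start > candidate:
            break             # gap found before this interval
        candidate = end + 1   # candidate is occupied; jump past the interval
        if candidate > last:
            return None
    return candidate if candidate <= last else None


print(find_first_available(1, 254, [(1, 10), (12, 20)]))  # 11
print(find_first_available(1, 254, [(0, 255)]))           # None
```

The sweep stops at the first gap, so it never has to walk the occupied intervals beyond the answer.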

Replace netaddr.IPSet-based availability and utilization calculations with SQL-side distinct host counts and Python-side interval merging. This avoids materializing large address sets in memory when rendering prefix and range summaries.

The new implementation:

- counts distinct IPAddress host values in the database
- merges populated/utilized IPRanges before counting address space
- avoids double-counting IPs covered by populated/utilized ranges
- finds the first available IP by streaming sorted occupied intervals
- updates the prefix detail panel to use available_ip_count
- adds host-expression indexes for IPRange start/end addresses

Also update IPRange child-IP matching and populated-range validation to compare host portions instead of full address/mask values, preserve rebuild_prefixes() compatibility while allowing queryset input, and fix BaseIPField.get_prep_value() to handle zero addresses such as 0.0.0.0 and ::.

pheus · 2026-05-15T21:37:31Z

@claude /review

I agree with points 2, 4, and 5.

For point 2, I’ve added equivalence coverage between available_ip_count and get_available_ips().size for the important availability cases. That should help guard against future semantic drift while still keeping the optimized path separate from the existing IPSet implementation.

For BaseIPField.get_prep_value(), the raw integer 0 edge case is fair. I added a focused test and preserve the previous behavior for that unsupported/raw input path, while still allowing valid zero IP addresses such as 0.0.0.0 and ::.

For the find_first_available_ip() fast-path comment, I agree the logic is correct but the wording can be clearer. I've adjusted the comment to make it explicit that this only covers the case where one merged occupied interval spans the full usable range.

For point 1, I think the current interval-exclusion approach is still the right trade-off for this PR. The intervals are merged before building the predicate, and the expected hot path is many IPs with comparatively few populated/utilized ranges. A very high number of disjoint ranges could justify a different strategy later, but I’d prefer to keep this implementation readable unless benchmarks show that case is a real bottleneck.

For point 3, the template already avoids calling get_first_available_ip() when available_count is zero. When there is available space, the count and first-available calculations use different query shapes, and I’d rather not introduce a combined/cached summary object in this PR. That can be revisited separately if benchmarks show the remaining duplicate work is still significant.

github-actions · 2026-05-15T21:37:48Z

Claude finished @pheus's task in 1m 25s

Re-review: Verifying Addressed Concerns

Read updated files
Verify point 2: equivalence tests
Verify point 4: get_prep_value edge case
Verify point 5: fast-path comment
Assess points 1 and 3

Findings

Point 2 — Equivalence coverage: well done.

The assertAvailableIPCountMatchesIPSet helper is correctly placed and called across 13 new test cases covering the meaningful scenarios: distinct-host deduplication, populated/utilized ranges, IPv4 pool vs. non-pool, IPv6 with subnet-router anycast exclusion, overlapping ranges, VRF scoping, a fully-populated prefix, and a container prefix. That's thorough. Semantic drift between the two code paths in any of these scenarios will now show up as a test failure.

Point 4 — get_prep_value fix: correct.

if value is None or value == '' or (type(value) is int and value == 0):

Using type(value) is int (exact type, not isinstance) is the right call here: it intentionally excludes booleans and other int subclasses, preserving legacy behaviour for raw-int 0 only. The test cases in test_fields.py cover None, '', IPAddress('0.0.0.0'), IPAddress('::'), and raw 0, closing the gap.

Point 5 — Fast-path comment: addressed.

"one merged occupied interval covers the entire usable span" is unambiguous. No confusion for future readers.

Point 1 — SQL explosion: accepted trade-off.

The reasoning is sound for the expected workload. Merging intervals before building the predicate is the right mitigation, and the scenario where it degrades (hundreds of disjoint non-overlapping ranges) is unlikely in practice. Acceptable to revisit if benchmarks show otherwise.

Point 3 — Duplicate queries in template: accepted trade-off.

The existing short-circuit when available_count == 0 eliminates the common case where get_first_available_ip() would otherwise run unnecessarily. Deferring a combined cache object to a follow-up is the right call given the scope of this PR.

No remaining issues of concern. The three addressed points are handled correctly and the test coverage added is solid.

jeremystretch

I've done an initial review, but I'm finding it difficult to keep straight the hierarchy of all the new utility functions. It seems like a lot of them could be moved to model or queryset methods to keep things better organized. What do you think?

jeremystretch · 2026-05-26T18:42:56Z

+        # Use an explicit None / empty-string check; `not value` incorrectly treats
+        # the valid zero addresses 0.0.0.0 and :: as empty. Raw int 0 is preserved
+        # as "empty" for backward compatibility (Django's ORM does not pass it here).
+        if value is None or value == '' or (type(value) is int and value == 0):


This seems equivalent but simpler:

Suggested change

-        if value is None or value == '' or (type(value) is int and value == 0):
+        if value in (None, '', 0):
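
A quick way to compare the two conditions is to run them side by side on a few plain Python values. The helper names below are hypothetical, and netaddr objects are deliberately left out, so this sketch only covers built-in types.

```python
def is_empty_exact(value):
    """The check from the diff: only None, '', and a literal int 0 are empty."""
    return value is None or value == '' or (type(value) is int and value == 0)


def is_empty_membership(value):
    """The suggested check: anything that compares equal to None, '', or 0."""
    return value in (None, '', 0)


for value in (None, '', 0, False, 0.0, '0.0.0.0'):
    print(repr(value), is_empty_exact(value), is_empty_membership(value))
```

The two agree on None, '', 0, and '0.0.0.0', but the membership test also treats False and 0.0 as empty because both compare equal to 0. Whether that difference matters depends on which values can actually reach get_prep_value().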

jeremystretch · 2026-05-26T18:51:31Z


        return available_ips

+    @property


Given that accessing this property triggers potentially expensive database queries (get_child_ranges(), get_child_ips()), we should probably implement it as a regular method, e.g. get_available_ip_count(). (Alternatively, we could use @cached_property but then we need to worry about stale value.)
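
The staleness concern with @cached_property can be shown with a small stand-in class. Pool and its attributes are hypothetical and only mimic a model whose count requires a costly query; functools.cached_property is used here, which stores the value on the instance in the same way as Django's equivalent.

```python
from functools import cached_property


class Pool:
    """Hypothetical stand-in for a model whose count requires a costly query."""

    def __init__(self, size):
        self.size = size
        self.used = set()
        self.queries = 0

    def get_available_count(self):
        # Explicit method: every call recomputes, and the cost is visible.
        self.queries += 1
        return self.size - len(self.used)

    @cached_property
    def available_count(self):
        # Cached on first access; later changes to `used` are not reflected.
        return self.get_available_count()


pool = Pool(size=10)
print(pool.available_count)        # 10
pool.used.add(1)
print(pool.available_count)        # still 10 (stale)
print(pool.get_available_count())  # 9
del pool.available_count           # drop the cached value
print(pool.available_count)        # 9
```

A method name such as get_available_ip_count() signals the cost at the call site, while the cached variant trades that for explicit invalidation.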

jeremystretch · 2026-05-26T18:52:28Z

+        get_available_ips() remains available for callers that need the actual set.
+        """
+        first_ip, last_ip = get_usable_ip_bounds(self)
+        usable_size = int(last_ip) - int(first_ip) + 1


Might be worth exposing this as a cached property on Prefix, but we also want to be careful to avoid confusion with self.prefix.size.
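
The difference between the usable size and the raw prefix size is easy to see with the standard library's ipaddress module. The bounds rule below is an assumption made for this sketch (IPv4 only; pools, /31s and /32s use every address); it is not taken from get_usable_ip_bounds(), whose body is not shown here.

```python
import ipaddress


def usable_bounds(prefix, is_pool=False):
    """Return (first, last) usable addresses for an IPv4 prefix.

    Assumed rule for this sketch: pools, /31s and /32s use every address;
    other prefixes exclude the network and broadcast addresses.
    """
    net = ipaddress.ip_network(prefix)
    first, last = net.network_address, net.broadcast_address
    if is_pool or net.prefixlen >= 31:
        return first, last
    return first + 1, last - 1


net = ipaddress.ip_network('192.0.2.0/24')
first_ip, last_ip = usable_bounds('192.0.2.0/24')
usable_size = int(last_ip) - int(first_ip) + 1
print(net.num_addresses, usable_size)  # 256 254
```

Two numbers that differ by design are a good reason to give the cached property a name that cannot be mistaken for the prefix size.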

jeremystretch · 2026-05-26T19:04:36Z

+    return sum(end - start + 1 for start, end in intervals)
+
+
+def count_distinct_ip_hosts_outside_intervals(queryset, intervals, version):


Can this be collapsed into count_distinct_ip_hosts() above by making intervals and version optional?

jeremystretch · 2026-05-26T19:09:31Z

+        return netaddr.IPNetwork(text).ip
+
+
+def annotate_host_address(queryset):


It might make sense to introduce a custom subclass of QuerySet for the IPAddress model to hold some of these functions, so that they can be executed on a queryset directly.

jeremystretch · 2026-05-26T19:11:56Z

+    )
+
+
+def filter_ip_hosts_between(queryset, first_ip, last_ip):


Did you happen to explore implementing this as a lookup? It would be nice to do e.g. .filter(address__host_between=[x, y]) if feasible.

jeremystretch · 2026-05-26T19:28:13Z

+        # Infer the Prefix model from the queryset/list so this helper does not need
+        # a model import at module scope.
+        prefix_model = getattr(prefix_list, 'model', None) or prefix_list[0].__class__


Could we use apps.get_model() here instead?

jeremystretch · 2026-05-26T19:30:45Z

+
+        return max(self.size - count_distinct_ip_hosts(self.get_child_ips()), 0)
+
+    @property


We probably want to keep this as a method, both to avoid a breaking change and because it effects a database query.

jeremystretch · 2026-05-26T19:34:32Z

+    @property
+    def available_ip_count(self):


Same concern here as with available_ip_count() on IPRange. Just calling out for consistency.

pheus force-pushed the 21870-optimize-prefix-view-alt branch from 331d99c to 031aad9 Compare May 15, 2026 17:53

pheus force-pushed the 21870-optimize-prefix-view-alt branch from 031aad9 to 80fa30c Compare May 15, 2026 21:24

pheus force-pushed the 21870-optimize-prefix-view-alt branch from 80fa30c to b16dd9e Compare May 15, 2026 21:36

pheus marked this pull request as ready for review May 15, 2026 21:48

pheus requested a review from jeremystretch May 15, 2026 21:48

jeremystretch requested changes May 26, 2026

