You’re given an integer array nums.
A subarray is balanced if:
#distinct even values in the subarray
= #distinct odd values in the subarray
Return the length of the longest balanced subarray.
This “distinct” part is what makes the problem feel unfair at first: duplicates don’t increase the count, so a normal prefix-sum trick doesn’t work out-of-the-box.
The mindset shift that makes it solvable
Let’s encode every value as a tiny “vote”:
- odd value → +1
- even value → -1
If a subarray has equal distinct odds and evens, then:
(distinct odds) − (distinct evens) = 0
So we want the longest subarray whose distinct-vote sum is 0.
But again: each number should only vote once per subarray, no matter how many times it appears.
That’s the whole trick.
A super practical way to think about “distinct” here
Fix a right endpoint i (we scan left → right).
For every distinct number we’ve seen so far, imagine placing a bookmark on its most recent (last) occurrence.
Example: if we’ve seen 5 at positions 2 and 7, the bookmark for 5 is at 7.
Now here’s the key observation:
For a cut position j (meaning subarray is (j+1 ... i))
A number x appears in (j+1 ... i) iff its last occurrence is after j.
So if we know the “vote sum of bookmarks at or before j”, we can subtract it from the total vote sum and instantly know the vote sum inside (j+1 ... i).
That leads to this clean identity:
- Let
now= sum of votes of all distinct numbers seen up toi - Let
A[j]= sum of votes of numbers whose last occurrence ≤ j
Then:
voteSumDistinct( (j+1 … i) ) =
now - A[j]
Balanced means vote sum is 0 ⇒ now - A[j] = 0 ⇒
A[j] = now
So for each i, we want the smallest j with A[j] == now (smallest j gives the longest length i - j).
How do we maintain A[j] efficiently?
A[j] depends on where the last occurrence of each value currently is.
If a number x currently has last occurrence at position p, then x contributes its vote to all A[j] where j >= p.
That’s a suffix add:
- add
vote(x)to range[p ... n]
Now when we move from p to a new last occurrence i, we:
- remove its old suffix contribution
[p ... n] - add the new suffix contribution
[i ... n]
This is exactly what the standard segment-tree + lazy propagation solution is doing.
Why the segment tree query works (the subtle but important detail)
We need to find the earliest j where A[j] == now.
A neat property holds here:
- At any moment, each index can be the last occurrence of at most one value (because only one value sits at that index).
- So as
jincreases by 1,A[j]changes by -1, 0, or +1 (never jumps by 2+).
That means within any segment, values are “continuous enough” that checking min ≤ target ≤ max lets us safely descend left-first to find the earliest exact match—exactly how the binary-search-on-segtree method works.
Step-by-step on a small example
nums = [2, 5, 4, 3]
Votes:
- 2 (even) = -1
- 5 (odd) = +1
- 4 (even) = -1
- 3 (odd) = +1
As we scan rightward, now tracks the distinct vote sum in the prefix.
We update suffix ranges based on last occurrences so A[j] always means “vote sum of values whose last occurrence is ≤ j”.
Whenever we find A[j] == now, the subarray (j+1 ... i) is balanced, and we try to maximize i - j.
At the final index, the whole array becomes balanced (2 distinct evens {2,4} and 2 distinct odds {5,3}), so answer is 4.
Complexity
- Each element does O(log n) segment tree updates and a O(log n) query.
- Total: O(n log n) time, O(n) space.
Clean Python implementation (explained, but still fast)
from typing import List
class SegTree:
def __init__(self, n: int):
# We store indices 0..n (inclusive), so length = n+1
self.n = n
size = 4 * (n + 1)
self.mn = [0] * size
self.mx = [0] * size
self.lazy = [0] * size
self._build(1, 0, n)
def _build(self, u: int, l: int, r: int) -> None:
self.mn[u] = self.mx[u] = self.lazy[u] = 0
if l == r:
return
m = (l + r) // 2
self._build(u * 2, l, m)
self._build(u * 2 + 1, m + 1, r)
def _apply(self, u: int, delta: int) -> None:
self.mn[u] += delta
self.mx[u] += delta
self.lazy[u] += delta
def _push(self, u: int) -> None:
if self.lazy[u] != 0:
d = self.lazy[u]
self._apply(u * 2, d)
self._apply(u * 2 + 1, d)
self.lazy[u] = 0
def add_range(self, ql: int, qr: int, delta: int) -> None:
self._add(1, 0, self.n, ql, qr, delta)
def _add(self, u: int, l: int, r: int, ql: int, qr: int, delta: int) -> None:
if ql <= l and r <= qr:
self._apply(u, delta)
return
self._push(u)
m = (l + r) // 2
if ql <= m:
self._add(u * 2, l, m, ql, qr, delta)
if qr > m:
self._add(u * 2 + 1, m + 1, r, ql, qr, delta)
self.mn[u] = min(self.mn[u * 2], self.mn[u * 2 + 1])
self.mx[u] = max(self.mx[u * 2], self.mx[u * 2 + 1])
def find_first_equal(self, target: int) -> int:
# target is guaranteed to exist because A[n] == now always holds
return self._find(1, 0, self.n, target)
def _find(self, u: int, l: int, r: int, target: int) -> int:
if l == r:
return l
self._push(u)
left = u * 2
right = u * 2 + 1
m = (l + r) // 2
# If target is inside left child's [min, max], it must exist there
if self.mn[left] <= target <= self.mx[left]:
return self._find(left, l, m, target)
return self._find(right, m + 1, r, target)
class Solution:
def longestBalanced(self, nums: List[int]) -> int:
n = len(nums)
st = SegTree(n)
last = {} # value -> last index (1-based)
now = 0 # sum of (+1 odd, -1 even) over DISTINCT values seen so far
ans = 0
# Use 1-based positions for easier "cut index" math with 0..n
for i, x in enumerate(nums, start=1):
det = 1 if (x & 1) else -1
if x not in last:
# first time we see x -> it becomes a new distinct value
now += det
else:
# x's last occurrence moves: remove old suffix contribution
st.add_range(last[x], n, -det)
last[x] = i
# add new suffix contribution at i
st.add_range(i, n, det)
# Find earliest cut j where A[j] == now => subarray (j+1..i) is balanced
j = st.find_first_equal(now)
ans = max(ans, i - j)
return ans