Update RateSampler to use the DD implementation#4
Conversation
| * @return true when the trace/spans has to be reported/written | ||
| */ | ||
| boolean sample(Span span); | ||
| boolean sample(DDSpan span); |
There was a problem hiding this comment.
I changed it because I needed to use span.getTraceId() in RateSampler.sampler. I'm not sure it was required/the best way.
There was a problem hiding this comment.
Ok... You start with a touchy subject.
Normally this is better to never refer to implemented classes in interfaces. This is the kind of things Jaa developers hate seeing.
So you have 2 options here as far as I know:
1- You can cast the Span to DDSpan while in the implement (after checking that ("span implements DDSpan")). A cast is generally not appreciated as well. As it cost execution time and it generally means that your architecture is wrong somewhere...
2- Otherwise you can use generics:
/**
* Main interface to sample a collection of traces.
*/
public interface Sampler<ImplementedSpan extends Span> {
/**
* Sample a collection of traces based on the parent span
*
* @param span the parent span with its context
* @return true when the trace/spans has to be reported/written
*/
boolean sample(ImplementedSpan span);
}
Then an implementation becomes:
/**
* Sampler that always says yes...
*/
public class AllSampler implements Sampler<DDSpan> {
@Override
public boolean sample(DDSpan span) {
return true;
}
}
| when(mockSpan.getTraceId()).thenAnswer(new Answer<Object>() { | ||
| @Override | ||
| public Object answer(InvocationOnMock s) throws Throwable { | ||
| return ThreadLocalRandom.current().nextLong(Long.MAX_VALUE); |
There was a problem hiding this comment.
Is that an okay way of generating a random int64?
I saw we used a nanoseconds method for now, but we need to generate uniformly generated IDs over the full [0, MAX_VALUE] spectrum.
There was a problem hiding this comment.
I did not know this method actually which has the good taste to provide long values between 0 and MAX.
I was about about to ask you why ThreadLocalRandom but... on Random... the nextLong garranteed to be positive. So ok, good idea.
| public class RateSampler implements Sampler { | ||
|
|
||
| /** | ||
| * Maximum value for the traceId. FIXME: should be real uint64. |
There was a problem hiding this comment.
The non-compatibility with uint64 / values in [2^63-1, 2^64-1] will be an issue at some point ; but for now we can keep it for later.
| long traceId = span.getTraceId(); | ||
| // Logical shift to move from signed to unsigned number. TODO: check that this is consistent with our Go implementation. | ||
| boolean sample = ((span.getTraceId() * RateSampler.SAMPLER_HASHER) >>> 1) < this.sampleRate*RateSampler.MAX_TRACE_ID_DOUBLE; | ||
| logger.debug("traceId: {} sampled: {}", traceId, sample); |
There was a problem hiding this comment.
Not sure, you should probably unit test it against a set of ids you know the exepected results.
|
My two cents:
|
|
I'm closing this one for the moment |
Pipeline #4 hit 'FATAL: AGENT_JAR not found' — stageMainDist apparently doesn't build dd-java-agent/build/libs/dd-java-agent-*-SNAPSHOT.jar on its own; the existing :smokeTest target was what dragged it in. Explicitly add :dd-java-agent:shadowJar to the gradle target. Also added find commands to discover where the agent jar actually lands in the CI workspace (may be in a non-default path due to build-cache).
#4: PropertyCardinalityHandler and TagCardinalityHandler are only consumed within this package; drop `public` from the class declarations, constructors, and methods. They're package-private now. #6: Add tests that lock down the EMPTY-on-null contract that the rest of the package depends on: - CardinalityHandlerTest covers both handlers: register(null) -> EMPTY, and registering null repeatedly doesn't consume the cardinality budget. - AggregateEntryTest covers the entry shape: optional fields built from a snapshot with null inputs resolve to EMPTY; populated optional fields carry their value. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement the rate sampling the way it works (in other clients / in the Agent), which is required for our distributed sampling and our adjusted statistics to work properly.