rubric_based_final_response_quality_v1 is hard to use for factual evaluation of google_search agents because its judge prompt requires tool_response evidence · google/adk-python@7ad7994