Add issue deduplicator workflow (#4628)

It's a bit hand-holdy in that it pre-downloads issue list but that keeps
codex running in read-only no-network mode.
This commit is contained in:
pakrym-oai
2025-10-02 14:36:33 -07:00
committed by GitHub
parent f895d4cbb3
commit 62cc8a4b8d
3 changed files with 108 additions and 0 deletions

19
.github/prompts/issue-deduplicator.txt vendored Normal file
View File

@@ -0,0 +1,19 @@
You are an assistant that triages new GitHub issues by identifying potential duplicates.
You will receive the following JSON files located in the current working directory:
- `codex-current-issue.json`: JSON object describing the newly created issue (fields: number, title, body).
- `codex-existing-issues.json`: JSON array of recent issues (each element includes number, title, body, createdAt).
Instructions:
- Load both files as JSON and review their contents carefully.
- Compare the current issue against the existing issues to find up to five that appear to describe the same underlying problem or request.
- Only consider an issue a potential duplicate if there is a clear overlap in symptoms, feature requests, reproduction steps, or error messages.
- Prioritize newer issues when similarity is comparable.
- Ignore pull requests and issues whose similarity is tenuous.
- When unsure, prefer returning fewer matches.
Output requirements:
- Respond with a JSON array of issue numbers (integers), ordered from most likely duplicate to least.
- Include at most five numbers.
- If you find no plausible duplicates, respond with `[]`.
- Do not emit any additional commentary, text, or keys beyond the JSON array.

View File

@@ -0,0 +1,88 @@
name: Issue Deduplicator
on:
issues:
types:
# - opened - disabled while testing
- labeled
jobs:
gather-duplicates:
name: Identify potential duplicates
if: ${{ github.event.action == 'opened' || (github.event.action == 'labeled' && github.event.label.name == 'codex-deduplicate') }}
runs-on: ubuntu-latest
permissions:
contents: read
outputs:
codex_output: ${{ steps.codex.outputs.final_message }}
steps:
- uses: actions/checkout@v4
- name: Prepare Codex inputs
env:
GH_TOKEN: ${{ github.token }}
run: |
set -eo pipefail
CURRENT_ISSUE_FILE=codex-current-issue.json
EXISTING_ISSUES_FILE=codex-existing-issues.json
gh issue list --repo "${{ github.repository }}" \
--json number,title,body,createdAt \
--limit 1000 \
--state all \
--search "sort:created-desc" \
| jq '.' \
> "$EXISTING_ISSUES_FILE"
printf '%s' '${{ toJson(github.event.issue) }}' \
| jq '{number, title, body}' \
> "$CURRENT_ISSUE_FILE"
- id: codex
uses: openai/codex-action@main
with:
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
prompt_file: .github/prompts/issue-deduplicator.txt
require_repo_write: false
comment-on-issue:
name: Comment with potential duplicates
needs: gather-duplicates
if: ${{ needs.gather-duplicates.result != 'skipped' }}
runs-on: ubuntu-latest
permissions:
contents: read
issues: write
steps:
- name: Comment on issue
uses: actions/github-script@v7
env:
CODEX_OUTPUT: ${{ needs.gather-duplicates.outputs.codex_output }}
with:
github-token: ${{ github.token }}
script: |
let numbers;
try {
numbers = JSON.parse(process.env.CODEX_OUTPUT);
} catch (error) {
core.info(`Codex output was not valid JSON. Raw output: ${raw}`);
return;
}
const lines = ['Potential duplicates detected:', ...numbers.map((value) => `- #${value}`)];
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.payload.issue.number,
body: lines.join("\n"),
});
- name: Remove codex-deduplicate label
if: ${{ always() && github.event.action == 'labeled' && github.event.label.name == 'codex-deduplicate' }}
env:
GH_TOKEN: ${{ github.token }}
run: |
gh issue edit "${{ github.event.issue.number }}" --remove-label codex-deduplicate || true
echo "Attempted to remove label: codex-deduplicate"

View File

@@ -28,6 +28,7 @@ jobs:
with:
openai_api_key: ${{ secrets.CODEX_OPENAI_API_KEY }}
prompt_file: .github/prompts/issue-labeler.txt
require_repo_write: false
apply-labels:
name: Apply labels from Codex output