Skip to content

Conversation

@amanfcp
Copy link
Contributor

@amanfcp amanfcp commented Jan 16, 2026

Description:

Changes:

  • Add UpdateLink function pointer to Chunk struct for source-specific link updates
  • Implement generateLink and updateLink functions in GitHub source for all repo types
  • Extract common URL manipulation helpers to pkg/giturl/helpers.go
  • Update engine.go to call chunk.UpdateLink directly instead of UpdateLink function
  • Mark SupportsLineNumbers as deprecated (used for backward compatibility only)
  • Update Git source to pass UpdateLink function pointer from configuration
  • Add unit tests for GitHub link generation and updating

End Goal & Future Work

Current State (This PR)

  • GitHub only: Implements source-specific link generation via function pointers
  • Other sources: Fall back to centralized giturl.GenerateLink() / UpdateLink()
  • Engine: Checks chunk.SourceUpdateLink != nil first, then falls back to SupportsLineNumbers()

Final State (After All Iterations)

// Sources own their link logic
github.Init() → sets UpdateLinkFunc: s.updateLink
gitlab.Init() → sets UpdateLinkFunc: s.updateLink
bitbucket.Init() → sets UpdateLinkFunc: s.updateLink

...

// Engine stays simple
if chunk.SourceUpdateLink != nil {
    updatedLink := chunk.SourceUpdateLink(link, line)
    // Done. No provider detection needed.
}

What Gets Removed

  • giturl.GenerateLink() - each source implements its own
  • giturl.UpdateLinkLineNumber() - replaced by source functions
  • SupportsLineNumbers() - replaced by nil check
  • Provider detection logic in giturl package

Note:

We can also remove the giturl package entirely and move the helper functions elsewhere.

Once this gets approved, we can iteratively implement this on all git flavored sources and remove the deprecated functions entirely.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

@amanfcp amanfcp requested a review from a team January 16, 2026 07:46
@amanfcp amanfcp requested review from a team as code owners January 16, 2026 07:46
@amanfcp
Copy link
Contributor Author

amanfcp commented Jan 16, 2026

A mismatch was identified between how line number support is defined and how metadata links are handled. SupportsLineNumbers() indicates that 9 source types support line numbers, but updateMetaDataLink() only implements logic for 5 metadata types. As a result, sources like GERRIT are marked as supported but have no corresponding metadata handling, causing them to fall through to the default error case.

Comment on lines +1204 to +1213
} else if SupportsLineNumbers(data.chunk.SourceType) {
// Fallback to centralized link updating for sources that haven't migrated yet
// (GitLab, Bitbucket, Azure Repos, etc.)
copyChunk := data.chunk
copyMetaDataClone := proto.Clone(data.chunk.SourceMetadata)
if copyMetaData, ok := copyMetaDataClone.(*source_metadatapb.MetaData); ok {
copyChunk.SourceMetadata = copyMetaData
}
fragStart, mdLine, link := FragmentFirstLineAndLink(&copyChunk)
ignoreLinePresent = SetResultLineNumber(&copyChunk, &res, fragStart, mdLine)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of duplicating the logic for this else if block, can we add an || in the first if condition? Then when calling the link update function, we can add a smaller if else check.

Copy link
Contributor Author

@amanfcp amanfcp Jan 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the duplication is messy but I was just trying to avoid nested ifs and making as few changes as possible in the previous flow (since it's temporary code if this gets approved)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplication like this can cause maintenance issues. This block already has one nested if so another one wouldn't increases the complexity level IMO

Comment on lines +22 to +36
// IsGistURL returns true if the repository URL is a GitHub gist.
func IsGistURL(repo string) bool {
return strings.Contains(repo, "gist.github.com")
}

// IsWikiURL returns true if the repository URL is a GitHub wiki.
func IsWikiURL(repo string) bool {
return strings.HasSuffix(repo, ".wiki.git")
}

// TrimWikiSuffix removes the .wiki.git suffix from a repository URL.
func TrimWikiSuffix(repo string) string {
return strings.TrimSuffix(repo, ".wiki.git")
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it makes sense to have these Github-specific helpers in the giturl package. If they are only used by the Github source, can we move them there?

// This function is provided by the source at initialization time and allows
// the engine to update links without knowing provider-specific formats.
// Can be nil for sources that don't support link updating.
SourceUpdateLink func(link string, line int64) string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the name of the field. I feel like it doesn't clearly explain the purpose. If it specifically deals with line numbers, can we call it UpdateLinkLineNumber?

Copy link
Contributor Author

@amanfcp amanfcp Jan 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I intentionally added Source prefix here like other source specific fields in this struct. SourceUpdateLinkLineNumber seems a bit unnecessarily specific don't you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just felt that SourceUpdateLink is a very generic term and doesn't clearly define what sort of updation will happen. But on the other hand the function signature takes in line int64 so I guess we're fine...

Copy link
Contributor

@mustansir14 mustansir14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall. A few minor suggestions/concerns.

@mustansir14
Copy link
Contributor

Also once this approach is finalized and merged in, we should create tickets to migrate other sources to this new pattern as well, with the purpose of completely removing the generic link generation/updation logic.

@amanfcp
Copy link
Contributor Author

amanfcp commented Jan 16, 2026

This test is failing. Not sure why we have a fixed size match so just putting it out here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants