Convert relative image links to base64 data URIs (markdown and HTML) #47
Co-authored-by: mapitman <4205286+mapitman@users.noreply.github.com>
@copilot Use this markdown file to test https://github.com/wimpysworld/deb-get/blob/main/README.md |
Tested with the deb-get README.md. Added support for HTML inline images ( Commit: 5d19cae |
mapitman
left a comment
Looks good. Tested locally and was able to render images.
Pull request overview
This PR adds automatic conversion of relative image paths to base64-encoded data URIs in both markdown image syntax and HTML inline images. This solves the issue where relative image paths break when HTML is rendered in a temporary directory, which is common with GitHub repository markdown files.
Key Changes
- Token processing system that walks parsed markdown tokens to find and convert relative image paths to base64 data URIs
- HTML image processing using regex to handle <img src="..."> tags with relative paths
- Security measures including 10MB file size limit and path traversal validation with up to 3 parent directory levels
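The description says relative paths are converted while external URLs are left unchanged. A minimal sketch of how that classification might look follows; `isRelativeImagePath` is a hypothetical name, and the exact set of schemes the PR checks is not shown on this page, so the list here is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isRelativeImagePath is a hypothetical helper (not taken from the PR). It
// reports whether src looks like a local relative path that is a candidate
// for embedding as a data URI.
func isRelativeImagePath(src string) bool {
	// Empty values and fragment-only references are never image files.
	if src == "" || strings.HasPrefix(src, "#") {
		return false
	}
	// Absolute paths and protocol-relative URLs ("//host/x.png") are skipped.
	if strings.HasPrefix(src, "/") || filepath.IsAbs(src) {
		return false
	}
	// Assumed scheme list: anything starting with one of these is external
	// or already embedded, so it is left unchanged.
	lower := strings.ToLower(src)
	for _, scheme := range []string{"http:", "https:", "data:", "file:"} {
		if strings.HasPrefix(lower, scheme) {
			return false
		}
	}
	return true
}

func main() {
	samples := []string{
		"./images/logo.png",
		"../assets/icon.png",
		"https://example.com/a.png",
		"data:image/png;base64,AAAA",
	}
	for _, src := range samples {
		fmt.Printf("%-30s relative=%v\n", src, isRelativeImagePath(src))
	}
}
```

Keeping this check separate from the file-reading step means the token walker and the HTML regex pass could share one definition of "relative".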
	return true
}

// imageToDataURI reads an image file and converts it to a base64 data URI
Copilot
AI
Dec 21, 2025
The function comment should specify what happens when image conversion fails. Consider adding that the function returns an empty string on failure, which causes the original path to be preserved (graceful degradation).
Suggested change:
- // imageToDataURI reads an image file and converts it to a base64 data URI
+ // imageToDataURI reads an image file and converts it to a base64 data URI.
+ // On any failure it logs a warning and returns an empty string so callers can
+ // gracefully fall back to using the original image path.
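The graceful-degradation contract described in this comment could be sketched as follows. The function signature, the `encodeDataURI` and `srcOrFallback` helpers, and the `maxImageSize` constant are assumptions made for illustration; only the 10MB limit and the `data:%s;base64,%s` format come from the PR itself.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"log"
	"os"
)

// maxImageSize mirrors the 10MB limit mentioned in the PR overview.
const maxImageSize = 10 << 20

// encodeDataURI builds the data URI string from raw bytes (hypothetical helper).
func encodeDataURI(data []byte, mimeType string) string {
	encoded := base64.StdEncoding.EncodeToString(data)
	return fmt.Sprintf("data:%s;base64,%s", mimeType, encoded)
}

// imageToDataURI reads an image file and converts it to a base64 data URI.
// On any failure it logs a warning and returns an empty string.
// The (path, mimeType) signature is an assumption, not the PR's signature.
func imageToDataURI(path, mimeType string) string {
	info, err := os.Stat(path)
	if err != nil {
		log.Printf("Warning: cannot stat image %s: %v", path, err)
		return ""
	}
	if info.IsDir() || info.Size() > maxImageSize {
		log.Printf("Warning: skipping %s (directory or larger than %d bytes)", path, maxImageSize)
		return ""
	}
	data, err := os.ReadFile(path)
	if err != nil {
		log.Printf("Warning: cannot read image %s: %v", path, err)
		return ""
	}
	return encodeDataURI(data, mimeType)
}

// srcOrFallback shows the caller side: keep the original src when the
// conversion returns an empty string.
func srcOrFallback(originalSrc, path, mimeType string) string {
	if uri := imageToDataURI(path, mimeType); uri != "" {
		return uri
	}
	return originalSrc
}

func main() {
	// Assuming this file does not exist, the original src is preserved.
	fmt.Println(srcOrFallback("./images/missing.png", "images/missing.png", "image/png"))
}
```

Returning an empty string rather than an error keeps the call sites short, at the cost of the caller not knowing why a conversion failed; the log line is the only record.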
	return fmt.Sprintf("data:%s;base64,%s", mimeType, encoded)
}

// getMimeType returns the MIME type based on file extension
Copilot
AI
Dec 21, 2025
The function comment should document the return value more clearly. Consider adding that it returns the appropriate MIME type string for the image format, or "image/*" for unknown formats.
Suggested change:
- // getMimeType returns the MIME type based on file extension
+ // getMimeType returns the appropriate image MIME type string based on the file
+ // extension, or "image/*" for unknown or unsupported image formats.
// Clean and validate the path to prevent path traversal attacks
cleanedPath, err := filepath.Abs(fullPath)
if err != nil {
	log.Printf("Warning: Invalid image path %s: %v", fullPath, err)
Copilot
AI
Dec 21, 2025
The error message could be more helpful by suggesting potential causes. Consider rephrasing to something like "Warning: Invalid image path %s (failed to resolve absolute path): %v" to clarify what operation failed.
Suggested change:
- log.Printf("Warning: Invalid image path %s: %v", fullPath, err)
+ log.Printf("Warning: Invalid image path %s (failed to resolve absolute path): %v", fullPath, err)
// Ensure the resolved path is within or relative to the base directory
cleanedBase, err := filepath.Abs(baseDir)
if err != nil {
	log.Printf("Warning: Invalid base directory %s: %v", baseDir, err)
Copilot
AI
Dec 21, 2025
The error message could be more helpful by suggesting potential causes. Consider rephrasing to something like "Warning: Invalid base directory %s (failed to resolve absolute path): %v" to clarify what operation failed.
Suggested change:
- log.Printf("Warning: Invalid base directory %s: %v", baseDir, err)
+ log.Printf("Warning: Invalid base directory %s (failed to resolve absolute path): %v", baseDir, err)
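The two `filepath.Abs` calls under review feed a containment check that this page only shows in part. The sketch below is one way such a check could be written, assuming the "up to 3 parent directory levels" rule from the overview is counted as leading `..` segments of the path relative to the base directory. `isPathAllowed` and `maxParentLevels` are hypothetical names.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// maxParentLevels reflects the limit of 3 parent directory levels stated in
// the PR overview.
const maxParentLevels = 3

// isPathAllowed is a hypothetical helper: it resolves imagePath against
// baseDir and rejects results that climb more than maxParentLevels above it.
func isPathAllowed(baseDir, imagePath string) bool {
	absBase, err := filepath.Abs(baseDir)
	if err != nil {
		return false
	}
	absPath, err := filepath.Abs(filepath.Join(baseDir, imagePath))
	if err != nil {
		return false
	}
	rel, err := filepath.Rel(absBase, absPath)
	if err != nil {
		return false
	}
	// filepath.Rel returns a cleaned path, so any ".." segments appear only
	// at the front; count them to measure how far the path escapes baseDir.
	levels := 0
	for _, part := range strings.Split(filepath.ToSlash(rel), "/") {
		if part != ".." {
			break
		}
		levels++
	}
	return levels <= maxParentLevels
}

func main() {
	base := "/repo/docs/guide/sub"
	for _, p := range []string{"images/logo.png", "../assets/icon.png", "../../../../etc/passwd"} {
		fmt.Printf("%-28s allowed=%v\n", p, isPathAllowed(base, p))
	}
}
```

A stricter variant would reject any path that leaves the base directory at all; the PR allows a bounded number of parent levels, presumably because repository READMEs often reference assets in sibling directories.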
// For unknown extensions, log a warning but try with generic image type
log.Printf("Warning: Unknown image extension %s for file %s, using image/* MIME type", ext, path)
return "image/*"
Copilot
AI
Dec 21, 2025
Using "image/*" as a fallback MIME type for unknown extensions is not standard and may not work correctly in all browsers. The asterisk is typically used in HTTP Accept headers, not in data URIs. Consider using a more specific default like "application/octet-stream" or "image/png", or simply skip conversion for unknown formats by returning an empty string.
Suggested change:
- // For unknown extensions, log a warning but try with generic image type
- log.Printf("Warning: Unknown image extension %s for file %s, using image/* MIME type", ext, path)
- return "image/*"
+ // For unknown extensions, log a warning but fall back to a generic binary MIME type
+ log.Printf("Warning: Unknown image extension %s for file %s, using application/octet-stream MIME type", ext, path)
+ return "application/octet-stream"
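The comment also mentions a third option: skipping conversion for unknown formats by returning an empty string. A rough sketch of that variant is shown here. The extension table is an assumption, since the PR's actual list of supported formats is not visible on this page.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// imageMimeTypes is an assumed table of supported extensions; the PR's real
// list may differ.
var imageMimeTypes = map[string]string{
	".png":  "image/png",
	".jpg":  "image/jpeg",
	".jpeg": "image/jpeg",
	".gif":  "image/gif",
	".svg":  "image/svg+xml",
	".webp": "image/webp",
}

// getMimeType returns the MIME type for a known image extension, or an empty
// string for anything else so the caller can skip conversion and keep the
// original path.
func getMimeType(path string) string {
	ext := strings.ToLower(filepath.Ext(path))
	return imageMimeTypes[ext] // missing keys yield the zero value ""
}

func main() {
	for _, p := range []string{"logo.PNG", "icon.svg", "notes.txt"} {
		fmt.Printf("%-10s -> %q\n", p, getMimeType(p))
	}
}
```

Compared with the `application/octet-stream` fallback, this avoids emitting a data URI that a browser may refuse to render as an image, and it reuses the empty-string convention that `imageToDataURI` already relies on.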
// imgSrcRegex matches <img> tags with src attributes
// Captures: 1=prefix, 2=opening quote, 3=src path, 4=closing quote
var imgSrcRegex = regexp.MustCompile(`(<img[^>]*\ssrc=)(["']?)([^"'\s>]+)(["']?)`)
Copilot
AI
Dec 21, 2025
The regex pattern for src paths uses [^"'\s>]+ which excludes spaces. This means unquoted src attributes containing spaces (e.g., <img src=my image.png>) would only capture the first part before the space ("my"), potentially causing incorrect path processing. While unquoted attributes with spaces are invalid HTML, consider adding validation to ensure src attributes are properly quoted, or update the regex to only match properly quoted or space-free unquoted values.
Suggested change:
- // imgSrcRegex matches <img> tags with src attributes
- // Captures: 1=prefix, 2=opening quote, 3=src path, 4=closing quote
- var imgSrcRegex = regexp.MustCompile(`(<img[^>]*\ssrc=)(["']?)([^"'\s>]+)(["']?)`)
+ // imgSrcRegex matches <img> tags with quoted src attributes
+ // Captures: 1=prefix, 2=opening quote, 3=src path, 4=closing quote
+ var imgSrcRegex = regexp.MustCompile(`(<img[^>]*\ssrc=)(["'])([^"']+)(["'])`)
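To make the difference between the two patterns concrete, the sketch below runs both against a few sample tags. The two regular expressions are copied from the diff above; `originalRe`, `quotedRe`, and `srcOf` are illustrative names that do not appear in the PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// Both patterns are copied from the review thread; only the variable names
// are invented for this comparison.
var (
	originalRe = regexp.MustCompile(`(<img[^>]*\ssrc=)(["']?)([^"'\s>]+)(["']?)`)
	quotedRe   = regexp.MustCompile(`(<img[^>]*\ssrc=)(["'])([^"']+)(["'])`)
)

// srcOf returns capture group 3 (the src path) or "" when re does not match.
func srcOf(re *regexp.Regexp, html string) string {
	m := re.FindStringSubmatch(html)
	if m == nil {
		return ""
	}
	return m[3]
}

func main() {
	samples := []string{
		`<img src="./images/logo.png" alt="logo">`,
		`<img src=my image.png>`,
		`<img src="my image.png">`,
	}
	for _, s := range samples {
		fmt.Printf("%-42s original=%q quoted=%q\n", s, srcOf(originalRe, s), srcOf(quotedRe, s))
	}
}
```

Two things stand out when running this. The original pattern truncates at the first space even when the value is quoted, so `"my image.png"` yields `my`, while the suggested pattern captures the full quoted value and ignores unquoted attributes entirely. Go's `regexp` package has no backreferences, so neither pattern can require the closing quote to match the opening one; that would need a separate check on groups 2 and 4.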

Relative image paths in markdown (e.g., ./images/logo.png, ../assets/icon.png) break when the HTML is rendered in a temp directory. This is common with GitHub repository markdown files that use both markdown image syntax and HTML inline images.
Changes
- Image tokens with relative paths
- HTMLInline and HTMLBlock tokens to handle <img src="..."> tags with relative paths
Security
Example
Markdown syntax:
HTML inline images:
Both markdown and HTML relative paths are embedded as data:image/png;base64,iVBORw0KG... while external URLs remain unchanged.
Testing
Tested with real-world examples including the deb-get README.md which uses extensive HTML inline images.
Markdown images:
HTML inline images (deb-get README):