Conversation

@RamGcia RamGcia commented Jan 15, 2026

Expanded the README to include detailed project features, setup instructions, usage examples, and notes on customization and limitations.

github-actions bot commented Jan 15, 2026

Pull Request Ethics, Security & Compliance Checklist
(Mandatory for all pull requests. This helps you develop professional habits required in industry and academia.)

  1. Nature of the contribution (select all that apply)

    • Purely non-functional changes (documentation, formatting, tests, CI/CD, refactoring without behavioral change)
    • Introduction or modification of cryptographic functions or security-critical code
    • Collection, storage, transmission, or processing of personal data (PII, health data, biometric data, location, etc.)
    • Use or processing of data belonging to children under 13 (or reasonably likely to include such data)
    • Implementation or modification of AI/ML models (training, fine-tuning, inference, prompt engineering)
    • Code interacting with external networks, APIs, or third-party services
  2. Security & responsible practice assessment

    • No security, privacy, or ethical implications identified
    • Potential security or privacy implications present (e.g., input validation, error handling, data exposure)
    • High-risk implications (potential for unauthorized access, data leakage, bias, or legal non-compliance)
  3. Relevant laws, regulations, and frameworks considered (list all that may apply, or if none, refer to content in D2L Ethics module)
    Examples:

    • Australian Privacy Act 1988 (Cth) & Privacy Amendment (Notifiable Data Breaches) Act 2017
    • GDPR (EU), CCPA/CPRA (California), COPPA (US children’s privacy)
    • University Human Research Ethics requirements (HREC/NHMRC)
    • NIST Cybersecurity Framework / OWASP Top 10 / ASVS
    • ISO/IEC 27001, NIST AI Risk Management Framework (AI RMF)
    • Export controls (ITAR, EAR, Australian Defence Export Controls)
  4. Security and responsible engineering measures implemented (select and describe)

    • Yes — specify below (e.g., input sanitization, prepared statements, least-privilege access, data minimization, anonymization, secure defaults, dependency scanning, bias audit, consent mechanism, etc.)
    • Partially implemented — further review recommended
    • No — measures appear necessary
    • Not applicable
  5. Additional notes (optional)
    Any references to secure coding guidelines followed (e.g., ISO27001, OWASP Secure Coding Practices, university security standards), vulnerability scans performed, or ethical review status.

By submitting this pull request, I confirm I have considered the security, privacy, and ethical implications of my contribution in accordance with university policy and applicable legal and professional standards.

Thank you for helping maintain a secure and responsible codebase.

Copilot AI left a comment

Pull request overview

This PR enhances the Asset Scanner README documentation to provide comprehensive information about features, setup, usage, and customization. The documentation update transforms a basic README into a well-structured guide for users and contributors.

Changes:

  • Expanded project description with detailed feature list including hybrid detection, OCR capabilities, and risk assessment
  • Restructured documentation with improved formatting using tables, sections, and examples
  • Added comprehensive CLI documentation, customization guides, and example outputs
  • Included contributing guidelines and references to project policies

| `--root` | Path | Recursively scan all files in a directory. | `python scanner.py --root "/path/to/"` |
| `--patterns` | Path | Custom path to `patterns.json`. | `--patterns ./configs/patterns.json` |
| `--out` | Path | Path to save structured scan results (e.g., `.json`, `.txt`). | `--out results.json` |
| `--ext` | List | Filter by file extensions (_default: .txt, .docx, .pdf_). | `--ext .txt .md` |

Copilot AI Jan 15, 2026

The documented default extensions for `--ext` are incorrect. According to `scanner.py` line 185, the actual default is `['.txt', '.json']`, not `.txt, .docx, .pdf`. `DEFAULT_TARGET_EXTS`, defined on line 51 of `scanner.py`, holds a broader list, but it is not the default for the `--ext` parameter.

Suggested change
- | `--ext` | List | Filter by file extensions (_default: .txt, .docx, .pdf_). | `--ext .txt .md` |
+ | `--ext` | List | Filter by file extensions (_default: .txt, .json_). | `--ext .txt .md` |
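
A mismatch like this can be caught mechanically by comparing the documented default with what the argument parser reports. The following is a minimal sketch, not the project's code: `build_parser` is a hypothetical stand-in for the parser in `scanner.py`, and only the `['.txt', '.json']` default is taken from the comment above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the parser built in scanner.py."""
    parser = argparse.ArgumentParser(prog="scanner.py")
    # Default taken from the review comment; nargs and help text are assumptions.
    parser.add_argument("--ext", nargs="+", default=[".txt", ".json"],
                        help="Filter by file extensions.")
    return parser


def documented_default_matches(parser, dest, documented):
    """Return True when the parser's real default equals the documented one."""
    return parser.get_default(dest) == documented


parser = build_parser()
readme_claim = [".txt", ".docx", ".pdf"]  # what the README table states
actual = parser.get_default("ext")         # what argparse will really use

if not documented_default_matches(parser, "ext", readme_claim):
    print(f"README says {readme_claim}, parser uses {actual}")
```

Running a check of this kind in CI would flag the README whenever the parser default changes.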

Copilot uses AI. Check for mistakes.
Comment on lines +138 to +139
"tip": "Rotate immediately; revoke if exposed.",
"compliance": ["GDPR Art. 33 — Data Breach Notification"]

Copilot AI Jan 15, 2026

The example compliance reference is incomplete. According to risk_rules.json lines 14-19, the aws_access_key actually includes multiple compliance references: 'Privacy Act 1988 (Cth) — APP 11', 'Privacy Act 1988 (Cth) — Notifiable Data Breaches (NDB) scheme, Part IIIC', and 'GDPR Art. 32 — Security of processing'. The tip should also be 'Rotate immediately; revoke if exposed; move to a secrets manager; purge from history.'
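
One way to keep the README example in step with the rules file is to compare the two field by field. This is a rough sketch under stated assumptions: the shape of `risk_rules.json` (a pattern name mapped to an object with `tip` and `compliance` keys) is assumed, and `diff_fields` is a hypothetical helper. The string values are quoted from the comment above and from the README excerpt.

```python
import json

# Assumed shape of one risk_rules.json entry; values quoted from the review comment.
RISK_RULES_JSON = """
{
  "aws_access_key": {
    "tip": "Rotate immediately; revoke if exposed; move to a secrets manager; purge from history.",
    "compliance": [
      "Privacy Act 1988 (Cth) — APP 11",
      "Privacy Act 1988 (Cth) — Notifiable Data Breaches (NDB) scheme, Part IIIC",
      "GDPR Art. 32 — Security of processing"
    ]
  }
}
"""

# The example as it currently appears in the README.
README_EXAMPLE = {
    "tip": "Rotate immediately; revoke if exposed.",
    "compliance": ["GDPR Art. 33 — Data Breach Notification"],
}


def diff_fields(documented, actual, fields=("tip", "compliance")):
    """Return the names of fields whose documented value differs from the rules file."""
    return [name for name in fields if documented.get(name) != actual.get(name)]


rules = json.loads(RISK_RULES_JSON)
stale = diff_fields(README_EXAMPLE, rules["aws_access_key"])
print("Out-of-date fields:", stale)
```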

"file": "example.docx",
"line": 1,
"match": "john.doe@example.com",
"risk": "Medium",

Copilot AI Jan 15, 2026

The example shows email risk as 'Medium', but according to risk_rules.json line 3, emails are classified as 'Low' risk, not 'Medium'.

Suggested change
- "risk": "Medium",
+ "risk": "Low",

## 🌟 Contributing

We welcome contributions! Fork the repository, create a feature branch, and submit a pull request.
Please adhere to our [Code of Conduct](CODE_OF_CONDUCT.md).

Copilot AI Jan 15, 2026

The README references a CODE_OF_CONDUCT.md file, but this file does not exist in the asset-scanner directory. This will result in a broken link for users viewing the asset-scanner README.

Suggested change
- Please adhere to our [Code of Conduct](CODE_OF_CONDUCT.md).
+ Please adhere to our [Code of Conduct](../CODE_OF_CONDUCT.md).
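
Broken relative links of this kind can be detected by resolving each Markdown link target against the README's own directory. The sketch below is illustrative only: `broken_relative_links` is a hypothetical helper, and the demo layout (a `CODE_OF_CONDUCT.md` in the repository root, one level above `asset-scanner/`) is an assumption inferred from the suggested change, not something the thread confirms.

```python
import re
import tempfile
from pathlib import Path

# Matches inline Markdown links and captures the target: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(readme: Path) -> list[str]:
    """Return relative link targets in `readme` that do not exist on disk."""
    broken = []
    for target in LINK_RE.findall(readme.read_text(encoding="utf-8")):
        # Skip absolute URLs (http:, mailto:, ...) and in-page anchors.
        if re.match(r"^[a-zA-Z][a-zA-Z0-9+.-]*:", target) or target.startswith("#"):
            continue
        path = target.split("#", 1)[0]  # drop any trailing #fragment
        if not (readme.parent / path).exists():
            broken.append(target)
    return broken


# Demo with an assumed layout: the file lives in the repository root only.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "CODE_OF_CONDUCT.md").write_text("placeholder\n", encoding="utf-8")
    sub = repo / "asset-scanner"
    sub.mkdir()
    readme = sub / "README.md"
    readme.write_text(
        "[Code of Conduct](CODE_OF_CONDUCT.md)\n"
        "[Code of Conduct](../CODE_OF_CONDUCT.md)\n",
        encoding="utf-8",
    )
    result = broken_relative_links(readme)

print(result)
```

The same check would also cover the `LICENSE` link, since it only depends on whether the target exists relative to the README.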

Comment on lines +213 to +214
This project is licensed under the [MIT License](LICENSE).
See the `LICENSE` file for full details.

Copilot AI Jan 15, 2026

The README references a LICENSE file with a relative link, but no LICENSE file exists in the asset-scanner directory. This will result in a broken link for users viewing the asset-scanner README.

Suggested change
- This project is licensed under the [MIT License](LICENSE).
- See the `LICENSE` file for full details.
+ This project is licensed under the MIT License.
+ See the `LICENSE` file in the repository root for full details.

| `--patterns` | Path | Custom path to `patterns.json`. | `--patterns ./configs/patterns.json` |
| `--out` | Path | Path to save structured scan results (e.g., `.json`, `.txt`). | `--out results.json` |
| `--ext` | List | Filter by file extensions (_default: .txt, .docx, .pdf_). | `--ext .txt .md` |
| `--no-console`| Flag | Suppress console output. Only write to the output file. | `--no-console` |

Copilot AI Jan 15, 2026

The --no-console flag is documented in the CLI table, but this argument does not exist in scanner.py's argument parser (lines 173-190). This feature is not implemented.

Suggested change
- | `--no-console`| Flag | Suppress console output. Only write to the output file. | `--no-console` |
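
Flags that are documented but not implemented can be found by extracting the option names from the README table and looking for each one in the parser's help text. This is a sketch under assumptions: the parser below is a hypothetical reconstruction containing only the four options shown in the table excerpts, and `flags_missing_from_parser` is an illustrative helper, not part of `scanner.py`.

```python
import argparse
import re

# Captures the flag name from the first cell of each table row, e.g. | `--ext` |
FLAG_ROW_RE = re.compile(r"^\|\s*`(--[\w-]+)`", re.MULTILINE)

README_TABLE = """\
| `--root` | Path | Recursively scan all files in a directory. | `python scanner.py --root "/path/to/"` |
| `--patterns` | Path | Custom path to `patterns.json`. | `--patterns ./configs/patterns.json` |
| `--out` | Path | Path to save structured scan results (e.g., `.json`, `.txt`). | `--out results.json` |
| `--ext` | List | Filter by file extensions (_default: .txt, .json_). | `--ext .txt .md` |
| `--no-console`| Flag | Suppress console output. Only write to the output file. | `--no-console` |
"""


def flags_missing_from_parser(readme_table: str, parser: argparse.ArgumentParser) -> list[str]:
    """Return documented flags that do not appear in the parser's help output."""
    help_text = parser.format_help()
    missing = []
    for flag in FLAG_ROW_RE.findall(readme_table):
        # Match the flag as a whole token so --out does not match --output.
        if not re.search(rf"(?<![\w-]){re.escape(flag)}(?![\w-])", help_text):
            missing.append(flag)
    return missing


# Hypothetical parser: only the options visible in the table excerpts.
parser = argparse.ArgumentParser(prog="scanner.py")
parser.add_argument("--root")
parser.add_argument("--patterns")
parser.add_argument("--out")
parser.add_argument("--ext", nargs="+", default=[".txt", ".json"])

missing = flags_missing_from_parser(README_TABLE, parser)
print(missing)
```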

## 📝 Example
Scanning a document containing:
## 🥼 Example

Copilot AI Jan 15, 2026

The emoji '🥼' (lab coat) is unusual for an 'Example' section. Consider using a more standard emoji like '📝' (memo) or '💡' (light bulb) for consistency with other section headers.

Suggested change
- ## 🥼 Example
+ ## 📝 Example

2 participants