@kaifcodec
Owner

@kaifcodec kaifcodec commented Dec 25, 2025

  • This PR adds two sites under the new email_scan directory.
  • It mainly serves as a proof of concept, showing how these sites send requests and how we can build and extend the helper functions going forward.

Task Checklist: (#123)

  • Create new directory structure under user_scanner/email_scan/
  • Implement email-specific scanning methods and modules
  • Update user_scanner/core/orchestrator.py to support email validation logic
  • Add -e / --email argument to main.py (Ensure -e is mutually exclusive with -p and other username-specific arguments)
  • Tweak Result and Printer classes for email-specific output
  • Add initial email validation functions (e.g., check_)
  • Move existing site modules into the new username_scan structure
  • Improve code quality and isolate both the scan methods
  • Update CONTRIBUTING.md with a proper guide on how to contribute to email_scan
  • Update the README

@kaifcodec kaifcodec linked an issue Dec 25, 2025 that may be closed by this pull request
7 tasks
@VamatoHD
Collaborator

VamatoHD commented Dec 26, 2025

@kaifcodec I didn't have time to fully implement the email logic.

@kaifcodec
Owner Author

kaifcodec commented Dec 27, 2025

@VamatoHD I reviewed the changes you made.
One thing I noticed is that you removed these lines from __main__.py

    if not args.username:
        parser.print_help()
        return

Is there any specific reason to remove that? Maybe some implementations later?

I would change it to

    if not (args.username or args.email):
        parser.print_help()
        return

Overall the start is good!

@VamatoHD
Collaborator

@kaifcodec I removed it since I thought the following code would replace it:

group = parser.add_mutually_exclusive_group(required=True)
group.add_argument(
    "-u", "--username", help="Username to scan across platforms"
)
group.add_argument(
    "-e", "--email", help="Email to scan across platforms"
)

However, it doesn't print the help menu.
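
For context, a minimal sketch of the behaviour discussed here, assuming plain argparse. With required=True, a missing argument makes argparse print a short usage line plus an error and exit with status 2, which is not the full help menu. Leaving the group non-required and checking manually keeps the full help. The prog name and the "scanning" message are placeholders, not the project's actual code.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # "user-scanner" is a placeholder prog name for this sketch.
    parser = argparse.ArgumentParser(prog="user-scanner")
    # Not required: the "nothing given" case is handled manually below.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-u", "--username", help="Username to scan across platforms")
    group.add_argument("-e", "--email", help="Email to scan across platforms")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    if not (args.username or args.email):
        # Full help menu instead of argparse's short usage/error message.
        parser.print_help()
        return 0
    target = args.username or args.email
    print(f"scanning {target}")  # placeholder for the real scan
    return 0
```

Passing both -u and -e is still rejected by argparse itself, since the group stays mutually exclusive.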

@kaifcodec
Owner Author

But I think the help menu is not that large right now, so maybe it's better to print it when no arguments are provided.
So I am restoring it for now, and I will add the --version flag in the next commit.

@VamatoHD
Collaborator

What if, instead of repeatedly passing the arguments printer, last, and is_email into the following functions:

def run_checks(username: str, printer: Printer, last: bool = True, is_email: bool = False) -> List[Result]:
def run_checks_category(category_path: Path, username: str, printer: Printer, last: bool = True, is_email: bool = False) -> List[Result]:
def run_module_single(module, username: str, printer: Printer, last: bool = True, is_email: bool = False) -> List[Result]:

I make a class that holds those values and is passed instead?

class RunContext:
   def __init__(self, printer, is_last, is_email):
      ...
def run_checks(username: str, ctx: RunContext) -> List[Result]:
def run_checks_category(category_path: Path, username: str, ctx: RunContext) -> List[Result]:
def run_module_single(module, username: str, ctx: RunContext) -> List[Result]:

@kaifcodec
Owner Author

Yeah, it makes it cleaner, but won't it also make it a bit more complex?
Are you going to create the objects like RunContext.is_email(True)?

@VamatoHD
Collaborator

It wouldn't be complex; it's just:

RunContext(printer, is_last, is_email)
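
A rough sketch of how that could look with a dataclass, under assumptions: Printer here is an empty stand-in, the results are plain strings instead of Result objects, and the module names are placeholders. Only the three field names come from this thread.

```python
from dataclasses import dataclass
from typing import List


class Printer:
    """Stand-in for the project's Printer; the real class is not shown in this thread."""


@dataclass(frozen=True)
class RunContext:
    """Bundles the options that were previously passed as separate arguments."""
    printer: Printer
    is_last: bool = True
    is_email: bool = False


def run_module_single(module_name: str, target: str, ctx: RunContext) -> List[str]:
    # Hypothetical body: it only shows ctx replacing three parameters.
    kind = "email" if ctx.is_email else "username"
    return [f"{module_name}: checked {kind} {target}"]


def run_checks(target: str, ctx: RunContext) -> List[str]:
    results: List[str] = []
    for module_name in ("site_a", "site_b"):  # placeholder module names
        results.extend(run_module_single(module_name, target, ctx))
    return results


ctx = RunContext(printer=Printer(), is_last=True, is_email=True)
print(run_checks("johndoe@gmail.com", ctx))
```

Making the dataclass frozen is a design choice for the sketch: the context is created once and cannot be changed by the functions it is passed to.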

Also, what if we revert the printing to always use the console format and only apply formatting to the output file? It would simplify the code a lot, and is_last and printer would probably be removed with it.

@kaifcodec
Owner Author

Then what are you planning to print on the console when a format is selected?
Maybe a placeholder is required!

Anyway, I can't see the full picture yet.
It would be better if you push the commits so I can review them and understand what you mean.
Then I can decide whether to change something or keep what you did.
It sounds good, though. Go on!

@kaifcodec kaifcodec changed the title from "PoC: initial commit for email osint mode" to "Feature: Add email osint mode" Dec 27, 2025
@VamatoHD
Collaborator

Basically, I reverted to the state before Printer was implemented. This way the output stays the same as it was in console mode, and the logic is much cleaner. It clearly needs more improvements, especially in testing.
If you dislike it, you can always revert the commit.

@kaifcodec
Owner Author

kaifcodec commented Dec 28, 2025

I saw this one, but there is an issue: you might have overlooked the JSON formatting in formatter.py.
Right now it completely ignores the -f json flag, prints console output, and doesn't write the results as JSON.

However CSV output works perfectly.

@VamatoHD
Collaborator

My idea was to revert the printing to console mode and just apply the formatting (CSV or JSON) to the output file. This improves readability since the arguments for the "run" functions are reduced.

Also, the formatted file appears to be formatted correctly.
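
To illustrate the idea, a small sketch assuming each result can be flattened to a dict with site, target, and status keys (those keys and the function names are made up for this example, not taken from formatter.py). The console always gets the same plain lines, and the format only affects the text written to the file.

```python
import csv
import io
import json
from typing import Dict, List

Row = Dict[str, str]  # hypothetical flattened result


def print_console(rows: List[Row]) -> None:
    # Console output is the same regardless of the chosen file format.
    for row in rows:
        print(f"{row['site']} ({row['target']}): {row['status']}")


def format_rows(rows: List[Row], fmt: str) -> str:
    """Return the text that would be written to the output file."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=["site", "target", "status"])
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


rows = [{"site": "X", "target": "johndoe@gmail.com", "status": "Registered"}]
print_console(rows)
print(format_rows(rows, "json"))
```

With this split, the "run" functions never need to know which format was requested.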

@kaifcodec
Owner Author

Oh yeah, it's working (the JSON output). I had mistakenly used -f csv but was expecting JSON.

@kaifcodec
Owner Author

@VamatoHD I’ve finished most of the core logic changes. There are still many bugs, and I am fixing them one by one.
At this point, we mainly just need to keep adding more sites. Since this is an OSINT mode, we don’t really need to worry about whether a site is popular or not; we can add whatever platforms we come across.

@VamatoHD
Collaborator

VamatoHD commented Jan 2, 2026

I'll work on improving the code.
Also, we could implement a test for each module in which a known email/username is tested.
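
A possible shape for such a test, as a sketch only: the registry, the check function, and the address are all stand-ins, since the real module interface is not shown in this thread. A real version would call each site module and would need network access.

```python
import unittest
from typing import Callable, Dict, Tuple


def check_example_site(email: str) -> bool:
    """Stand-in check; a real module would query the site."""
    return email == "known@example.com"


# Hypothetical registry: module name -> (check function, address known to be registered).
KNOWN_CASES: Dict[str, Tuple[Callable[[str], bool], str]] = {
    "example_site": (check_example_site, "known@example.com"),
}


class KnownAccountTests(unittest.TestCase):
    def test_known_email_is_reported_registered(self) -> None:
        for name, (check, email) in KNOWN_CASES.items():
            # subTest reports each module separately instead of stopping at the first failure.
            with self.subTest(module=name):
                self.assertTrue(check(email))
```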

@kaifcodec
Owner Author

There are still a few minor issues to address, for example:

  • I’m currently using the existing Result.available() and Result.taken() even for email scans. This was mainly to get everything working quickly so we could see the full flow of the tool in action. We can clean this up later with better naming and general code quality improvements.

  • Right now, it prints all outputs (registered and not registered) by default. This is where the -v / --verbose flag can come into play. I’m thinking it would be better if we use verbose mode for email scans so that Not Registered results are only shown when -v is used.

  • Add a final clean, boxed summary output showing where the email was found to be registered.
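
The last two points could be sketched roughly like this. EmailResult and both function names are invented for the example, and the box style is just one option; the project's Result class would take the place of the tuple.

```python
from typing import List, NamedTuple


class EmailResult(NamedTuple):
    """Hypothetical result record for this sketch."""
    site: str
    email: str
    registered: bool


def visible_results(results: List[EmailResult], verbose: bool) -> List[EmailResult]:
    # Not Registered entries are shown only in verbose mode.
    return [r for r in results if r.registered or verbose]


def boxed_summary(results: List[EmailResult]) -> str:
    sites = [r.site for r in results if r.registered]
    if sites:
        lines = ["Registered on:"] + [f"  {site}" for site in sites]
    else:
        lines = ["No registrations found"]
    width = max(len(line) for line in lines)
    border = "+" + "-" * (width + 2) + "+"
    body = [f"| {line.ljust(width)} |" for line in lines]
    return "\n".join([border, *body, border])


results = [
    EmailResult("X", "johndoe@gmail.com", True),
    EmailResult("Mastodon", "johndoe@gmail.com", False),
]
for r in visible_results(results, verbose=False):
    print(f"{r.site} ({r.email}): Registered")
print(boxed_summary(results))
```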

@VamatoHD
Collaborator

VamatoHD commented Jan 2, 2026

Is it intentional that the email scanner doesn't show headers like the username scanner does, such as == Creator SITES ==? If so, the same could be applied to the username scanner, as it would clean up the logic.

@kaifcodec
Owner Author

I am implementing the headers; I have even finished the changes in email_orchestrator.py.

@kaifcodec
Owner Author

By the way, I’m not planning to touch user_scans at all. I want to keep it isolated and unchanged for now since it’s working perfectly. Let’s finish this PR first, and we can think about revisiting or improving that part sometime later.

That said, are you planning to work on this next—specifically improving the code quality and handling the renaming of classes/objects?

@VamatoHD
Collaborator

VamatoHD commented Jan 2, 2026

It may be my perfectionism, but, while user_scanner checks each user individually and prints each module for that user, the email scanner checks every username permutation in a module before checking the next:

  [✘] X (johndoea@gmail.com): Registered
  [✘] X (johndoe@gmail.com): Registered
  [✘] X (johndoeb@gmail.com): Registered
  [✔] Instagram (johndoeb@gmail.com): Not Registered
  [✔] Mastodon (johndoeb@gmail.com): Not Registered
  [✔] Mastodon (johndoea@gmail.com): Not Registered
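
The difference comes down to loop order. A toy sketch, assuming the scan is a plain nested loop over modules and addresses (the real orchestrator may run checks concurrently, which would interleave the output further):

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (module name, email)


def module_major(emails: List[str], modules: List[str]) -> List[Pair]:
    # Every permutation is checked in one module before moving to the next module.
    return [(module, email) for module in modules for email in emails]


def email_major(emails: List[str], modules: List[str]) -> List[Pair]:
    # Every module is checked for one address before moving to the next address.
    return [(module, email) for email in emails for module in modules]


emails = ["johndoea@gmail.com", "johndoe@gmail.com"]
modules = ["X", "Instagram", "Mastodon"]
for module, email in email_major(emails, modules):
    print(f"{module} ({email})")
```

Using the email-major order would make the email output grouped per address, the same way the username scanner groups per user.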

@VamatoHD
Collaborator

VamatoHD commented Jan 2, 2026

That said, are you planning to work on this next—specifically improving the code quality and handling the renaming of classes/objects?

I'm still busy with a school project, but, in my free time, I'll refactor the __main__.py code and probably reuse the same asyncio client for every email check.
Since every email check is unique, I won't waste time trying to implement DRY.
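
Roughly what reusing one client could look like, as a sketch with assumptions: FakeClient stands in for whatever async HTTP client the project uses, the .invalid URLs are placeholders, and check_site is a made-up helper. The point is only that one client is created per scan and handed to every check.

```python
import asyncio
from typing import List


class FakeClient:
    """Stand-in for an async HTTP client; it performs no real network I/O."""
    instances = 0

    def __init__(self) -> None:
        FakeClient.instances += 1
        self.requests = 0

    async def get(self, url: str) -> int:
        self.requests += 1
        await asyncio.sleep(0)  # yield control, as a real request would
        return 200

    async def aclose(self) -> None:
        pass


async def check_site(client: FakeClient, site: str, email: str) -> str:
    # Each site keeps its own request logic; only the client is shared.
    status = await client.get(f"https://{site}.invalid/lookup?email={email}")
    return f"{site}: {status}"


async def scan_email(email: str, sites: List[str]) -> List[str]:
    client = FakeClient()  # created once for the whole scan
    try:
        return await asyncio.gather(*(check_site(client, s, email) for s in sites))
    finally:
        await client.aclose()


print(asyncio.run(scan_email("johndoe@gmail.com", ["x", "mastodon"])))
```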

@kaifcodec
Owner Author

kaifcodec commented Jan 2, 2026

Oh yeah, I see that’s an issue and it definitely should be fixed.

If you want to work on it, I can leave it as-is for now. Same here: I’ll be extremely busy for the next 20–25 days (I can't even touch the code), which is why I pushed all of this together, to give a complete picture of the direction we should take and how things are supposed to flow.

I'm still busy with a school project, but, in my free time, I'll refactor the __main__.py code and probably reuse the same asyncio client for every email check.
Since every email check is unique, I won't waste time trying to implement DRY.

Take your time!

Development

Successfully merging this pull request may close these issues.

Feat: Email OSINT feature

3 participants