Feature/backfill historical featured quotes #9
Open
Agamya-Samuel wants to merge 25 commits into indictechcom:main from Agamya-Samuel:feature/backfill-historical-featured-quotes
…y directly from the WikiQuote API. Updated the function name for clarity and added detailed documentation explaining the rationale behind this approach. The new implementation ensures users receive the latest quote based on UTC time, accommodating different timezones. If the quote is not found in the database, it will be added automatically.
…ng of quote entries
…king of quote entries. - Updated the add_quote_to_db function to set these fields to the current UTC time upon creation. This change enhances the database schema and ensures accurate timestamps for each quote entry.
…g and connection status updates
…hance structure. - Updated Quote and QuoteCreate schemas for clarity and maintainability.
… and update docstring for improved understanding of the API behavior.
…chema for consistency
… for improved timestamp tracking. - Updated the extract_quote function to include created_at and updated_at fields, both set to the current UTC time. This change enhances the data model by providing accurate timestamps for when quotes are created and updated, ensuring better tracking and management of quote entries.
…th error handling
…in storing longer content
… environment variable management
- Created core functionality to process and extract quotes from Wikiquote.
- Added configuration file for quote URLs by year and month.
- Implemented main application entry point and asynchronous processing of quote URLs.
- Introduced utility functions for loading configuration, validating URLs, and parsing quote data.
- Set up logging for better debugging and error tracking.
This commit lays the foundation for the Historical Featured Quotes App, enabling the extraction and storage of quotes from specified URLs.
…ion and timestamp management
…ured date for improved uniqueness
…ons and configuration details
…ions and features for historical quotes population
Merge pull request indictechcom#6 from Agamya-Samuel/feature/db-integration
…/github.com/Agamya-Samuel/wiki-featured-content-feed into feature/backfill-historical-featured-quotes
- Introduced API_HEADERS to include User-Agent and Accept headers, preventing 403 errors during API requests.
- Added extract_quotes_from_api_response function to process MediaWiki API responses, extracting quotes and determining the year from the title or HTML content.
- Improved error handling for API responses, including status code checks and JSON parsing validation.
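The header and response-validation changes described in this commit could look roughly like the sketch below. It is not the PR's code: the header values, `build_request`, and `parse_api_response` are hypothetical names chosen for illustration, and only the presence of User-Agent and Accept headers, the status code check, and the JSON validation come from the commit message.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical values: the commit only states that API_HEADERS carries
# User-Agent and Accept headers.
API_HEADERS = {
    "User-Agent": "quote-backfill-sketch/0.1 (contact: maintainer@example.org)",
    "Accept": "application/json",
}


def build_request(url: str) -> urllib.request.Request:
    """Attach the shared headers to every outgoing API request."""
    return urllib.request.Request(url, headers=API_HEADERS)


def parse_api_response(status: int, body: bytes) -> Optional[dict]:
    """Return the decoded JSON object, or None if the response is unusable."""
    if status != 200:
        # Status code check: a 403, for example, is rejected here.
        return None
    try:
        data = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return None
    # Only a JSON object is treated as a valid API payload in this sketch.
    return data if isinstance(data, dict) else None
```

Keeping the validation in a function that takes a status code and raw body makes it testable without any network access.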
- Improved create_multiple_quotes function to handle string date conversion, duplicate checking, and error logging.
- Added detailed logging for created, skipped, and errored quotes during batch processing.
- Refactored quote processing in backfill_historical_featured_quotes_app to ensure proper session management and error handling.
- Ensured that missing fields in quotes are handled gracefully before database insertion.
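A minimal sketch of the batch behaviour this commit describes (string date conversion, duplicate checking, and created/skipped/errored counts) follows. It uses an in-memory SQLite table instead of the project's actual database layer, and the table layout, the column names (`quote`, `author`, `featured_date`), and the function name `create_multiple_quotes_sketch` are assumptions made for illustration only.

```python
import logging
import sqlite3
from datetime import date, datetime, timezone

logger = logging.getLogger("backfill_sketch")


def make_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical quotes table, unique per featured date."""
    conn.execute(
        "CREATE TABLE quotes (quote TEXT NOT NULL, author TEXT, "
        "featured_date TEXT UNIQUE, created_at TEXT, updated_at TEXT)"
    )


def create_multiple_quotes_sketch(conn: sqlite3.Connection, quotes: list) -> dict:
    """Insert quotes one by one, counting created, skipped, and errored rows."""
    counts = {"created": 0, "skipped": 0, "errored": 0}
    for raw in quotes:
        try:
            featured = raw["featured_date"]
            if isinstance(featured, str):
                # String date conversion; raises ValueError on bad input.
                featured = date.fromisoformat(featured)
            key = featured.isoformat()
            exists = conn.execute(
                "SELECT 1 FROM quotes WHERE featured_date = ?", (key,)
            ).fetchone()
            if exists:
                logger.info("skipping duplicate for %s", key)
                counts["skipped"] += 1
                continue
            now = datetime.now(timezone.utc).isoformat()
            conn.execute(
                "INSERT INTO quotes VALUES (?, ?, ?, ?, ?)",
                (raw["quote"], raw.get("author", "Unknown"), key, now, now),
            )
            counts["created"] += 1
        except (KeyError, ValueError, AttributeError) as exc:
            # Missing or malformed fields are logged and counted, not fatal.
            logger.warning("could not store quote %r: %s", raw, exc)
            counts["errored"] += 1
    conn.commit()
    return counts
```

Handling each quote inside its own `try` block means one malformed entry does not abort the rest of the batch, which matches the per-quote logging the commit mentions.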
Pull Request Summary
Reference: #5
Historical Quote Backfill Tool
Summary
This PR adds a utility application to backfill historical quotes from Wikiquote's archives (dating back to 2007) into the Quote of the Day database. The tool handles multiple HTML formats from different time periods, processes quotes concurrently using asyncio, and ensures proper date formatting.
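As a rough illustration of the concurrent processing mentioned in the summary, the sketch below fans out over a list of archive URLs with asyncio while capping the number of requests in flight. The names `process_urls`, `fetch`, and `max_concurrency` are hypothetical and not taken from the PR; the fetch coroutine is passed in so the pattern can be shown without committing to a particular HTTP client.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


async def process_urls(
    urls: List[str],
    fetch: Callable[[str], Awaitable[Any]],
    max_concurrency: int = 5,
) -> Dict[str, Any]:
    """Fetch all URLs concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url: str):
        async with semaphore:
            try:
                return url, await fetch(url)
            except Exception as exc:
                # Keep the failure as the result so one bad URL
                # does not cancel the remaining work.
                return url, exc

    pairs = await asyncio.gather(*(worker(u) for u in urls))
    return dict(pairs)
```

A caller would run this with `asyncio.run(process_urls(urls, fetch))` and then separate successful results from stored exceptions. Bounding concurrency with a semaphore is one way to stay polite toward the upstream API during a large backfill.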
Features
Implementation Details
Testing