Save some URLs now -> [Add page]
Import URLs from a browser -> [Import page]
Import URLs from a 3rd party bookmarking service -> [Sync page]
Archive URLs on a schedule -> [Schedule page]
Archive an entire website -> [Crawl page]
Crawl for URLs with a search engine and save automatically
Save some URLs on a schedule
Save an entire website (e.g. https://example.com)
Save results matching a search query (e.g. "site:example.com")
Save a social media feed (e.g. https://x.com/user/1234567890)
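For illustration, each of these could end up as a single Crawl record; the field names used below (seed_uri, crawler) are assumptions for the sketch, not the actual schema:

    # Hypothetical sketch only -- seed_uri / crawler field names are assumptions.
    Crawl.objects.create(seed_uri='https://example.com', crawler='full_site')        # entire website
    Crawl.objects.create(seed_uri='site:example.com', crawler='search_engine')       # search query results
    Crawl.objects.create(seed_uri='https://x.com/user/1234567890', crawler='feed')   # social media feed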
Archive URLs on a schedule -> [Schedule page]
Choose a Schedule to check for new URLs on: Schedule.objects.get(pk=xyz)
    e.g. every 1 day
Choose a destination Crawl to archive the URLs with: Crawl.objects.get(pk=xyz)
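A minimal sketch of what this step could persist, assuming hypothetical interval and crawl fields on Schedule (both names are illustrative):

    # Hypothetical sketch -- interval / crawl field names are assumptions.
    from datetime import timedelta

    crawl = Crawl.objects.get(pk=1)   # placeholder pk for the destination Crawl
    Schedule.objects.create(interval=timedelta(days=1), crawl=crawl)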
Add a new source to pull URLs in from (WIZARD)
Provide credentials for the source
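As a rough illustration, the wizard's output might be stored like this; the Source model and its fields are hypothetical, not an existing part of the schema:

    # Hypothetical sketch -- Source and its fields are assumptions.
    Source.objects.create(
        source_type='pocket',             # e.g. browser, Pocket, Pinboard, RSS feed, ...
        credentials={'api_key': '...'},   # secrets needed to poll the source for new URLs
    )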
Create a new alert and choose its trigger condition
Choose the alert threshold:
Choose how to notify: (List[AlertDestination])
Choose scope:
Choose ArchiveResult scope (extractors): (a query that returns ArchiveResult.objects QuerySet)
Choose Snapshot scope (URL): (a query that returns Snapshot.objects QuerySet)
Choose crawl scope: (a query that returns Crawl.objects QuerySet)
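For example, each scope could be stored as a queryset filter like the following; the specific filter fields are illustrative and may differ from the real models:

    # Illustrative queryset filters for each scope; field names are assumptions.
    archiveresult_scope = ArchiveResult.objects.filter(extractor='singlefile', status='failed')
    snapshot_scope = Snapshot.objects.filter(url__icontains='example.com')
    crawl_scope = Crawl.objects.filter(pk=1)   # placeholder pk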
from django.db import models

class AlertDestination(models.Model):
    # Where alerts get delivered: email, slack, webhook, google_sheet, local logfile, b2/s3/gcp bucket, etc.
    destination_type = models.CharField(max_length=32)
    maximum_frequency = models.DurationField()      # rate limit: send at most one alert per this interval
    filter_rules = models.JSONField(default=dict)   # extra rules for which alerts this destination accepts
    credentials = models.JSONField(default=dict)    # auth/config needed to deliver to this destination
    alert_template = models.TextField()             # Jinja2 JSON/text template populated with alert contents
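As a usage sketch, dispatching an alert could render the Jinja2 template with the alert contents before delivery; the render_alert helper below is hypothetical:

    from jinja2 import Template

    def render_alert(destination: AlertDestination, context: dict) -> str:
        # Fill the destination's Jinja2 JSON/text template with the alert contents.
        return Template(destination.alert_template).render(**context)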