Nick Sweeting c1fd2cfa42 tag URLs immediately once added instead of waiting until archival completes 2 years ago
__init__.py f0033f75d0 config.py lint fixes 2 years ago
archive_org.py bd6d9c165b enforce utf8 on literally all file operations because windows sucks 4 years ago
dom.py 603ce7ec10 After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2 years ago
favicon.py 1e50ca243e Add FAVICON_PROVIDER option for custom favicon service 2 years ago
git.py 5420903102 Refactor `should_save_extractor` methods to accept `overwrite` parameter 5 years ago
headers.py 5420903102 Refactor `should_save_extractor` methods to accept `overwrite` parameter 5 years ago
htmltotext.py 310b4d1242 Add htmltotext extractor 2 years ago
media.py b864c38d9e Don't be strict on unicode errors 3 years ago
mercury.py acb932ba12 improve readability and mercury error handling and fix output path to be relative 5 years ago
pdf.py 603ce7ec10 After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2 years ago
readability.py c1fd2cfa42 tag URLs immediately once added instead of waiting until archival completes 2 years ago
screenshot.py 603ce7ec10 After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2 years ago
singlefile.py d77c770c47 add CHROME_TIMEOUT args 3 years ago
title.py db2984e47b prefer dom dump to singlefile for generating readability output 2 years ago
wget.py a9986f1f05 add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support 4 years ago
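
The commit message on dom.py, pdf.py, and screenshot.py (603ce7ec10) describes a cleanup step: when a Chrome-based extractor times out, the leftover SingletonLock file is removed so later Chrome instances can start. The following is a minimal sketch of that idea, not the repository's actual code. The function names, the `user_data_dir` parameter, and the assumption that the lock sits directly inside Chrome's user data directory are all illustrative.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def remove_singleton_lock(user_data_dir: Path) -> bool:
    """Delete a leftover SingletonLock; return True if one was removed.

    Hypothetical helper: assumes the lock lives at <user_data_dir>/SingletonLock.
    """
    lock = user_data_dir / "SingletonLock"
    # The lock may be a dangling symlink, for which exists() is False,
    # so is_symlink() is checked as well.
    if lock.is_symlink() or lock.exists():
        lock.unlink()
        return True
    return False


def run_with_lock_cleanup(
    cmd: List[str], timeout: float, user_data_dir: Path
) -> subprocess.CompletedProcess:
    """Run cmd; on timeout, clear the stale lock and re-raise the error."""
    try:
        return subprocess.run(cmd, timeout=timeout, capture_output=True, text=True)
    except subprocess.TimeoutExpired:
        # The timed-out process was killed and may not have cleaned up its lock.
        remove_singleton_lock(user_data_dir)
        raise
```

In this sketch the timeout is re-raised after cleanup, so the caller can still record the extractor run as failed.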