| File | Commit | Message | Last updated |
|---|---|---|---|
| __init__.py | f0033f75d0 | config.py lint fixes | 2 years ago |
| archive_org.py | bd6d9c165b | enforce utf8 on literally all file operations because windows sucks | 4 years ago |
| dom.py | 603ce7ec10 | After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. | 2 years ago |
| favicon.py | 1e50ca243e | Add FAVICON_PROVIDER option for custom favicon service | 2 years ago |
| git.py | 5420903102 | Refactor `should_save_extractor` methods to accept `overwrite` parameter | 5 years ago |
| headers.py | 5420903102 | Refactor `should_save_extractor` methods to accept `overwrite` parameter | 5 years ago |
| htmltotext.py | 310b4d1242 | Add htmltotext extractor | 2 years ago |
| media.py | b864c38d9e | Don't be strict on unicode errors | 3 years ago |
| mercury.py | acb932ba12 | improve readability and mercury error handling and fix output path to be relative | 5 years ago |
| pdf.py | 603ce7ec10 | After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. | 2 years ago |
| readability.py | c1fd2cfa42 | tag URLs immediately once added instead of waiting until archival completes | 2 years ago |
| screenshot.py | 603ce7ec10 | After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. | 2 years ago |
| singlefile.py | d77c770c47 | add CHROME_TIMEOUT args | 3 years ago |
| title.py | db2984e47b | prefer dom dump to singlefile for generating readability output | 2 years ago |
| wget.py | a9986f1f05 | add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support | 4 years ago |
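
The commit message attached to dom.py, pdf.py, and screenshot.py (603ce7ec10) describes removing Chrome's leftover `SingletonLock` file when an extractor times out. A minimal sketch of that idea follows. It is not the code from that commit: the function name `run_with_lock_cleanup` and its parameters are hypothetical, and it assumes the lock lives directly inside the Chrome user data directory that the caller passes in.

```python
import subprocess
from pathlib import Path


def run_with_lock_cleanup(cmd, timeout, user_data_dir):
    """Run cmd; if it times out, delete a stale SingletonLock and re-raise.

    Hypothetical helper for illustration. user_data_dir is assumed to be
    the Chrome profile directory where the lock file is left behind.
    """
    try:
        # subprocess.run kills the child process when the timeout expires
        return subprocess.run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        lock = Path(user_data_dir) / "SingletonLock"
        # missing_ok (Python 3.8+) avoids an error if no lock was created
        lock.unlink(missing_ok=True)
        raise
```

Re-raising after the cleanup leaves the caller free to record the timeout as a failed extraction, while the next Chrome launch is no longer blocked by the stale lock.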