Nick Sweeting
|
c1fd2cfa42
tag URLs immediately once added instead of waiting until archival completes
|
1 year ago |
Nick Sweeting
|
78d942ac22
show more detail in readability error messages
|
1 year ago |
Nick Sweeting
|
5b07a1126c
add comment about why DOM is preferred over singlefile for readability parsing
|
1 year ago |
Nick Sweeting
|
2c54e55697
prefer dom dump to singlefile for generating readability output
|
1 year ago |
Nick Sweeting
|
82d8662c74
add more readability error output
|
2 years ago |
prnake
|
011bd104cb
remove unused import
|
3 years ago |
papersnake
|
de8e22efb7
improve title extractor
|
3 years ago |
Nick Sweeting
|
eb4d3bca9d
Update readability.py
|
4 years ago |
Nick Sweeting
|
a9986f1f05
add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support
|
4 years ago |
Nick Sweeting
|
bd6d9c165b
enforce utf8 on literally all file operations because windows sucks
|
4 years ago |
Nick Sweeting
|
acb932ba12
improve readability and mercury error handling and fix output path to be relative
|
4 years ago |
Nick Sweeting
|
d0f8a5e710
change mercury atomic_write output order
|
4 years ago |
Dan Arnfield
|
5420903102
Refactor `should_save_extractor` methods to accept `overwrite` parameter
|
4 years ago |
JDC
|
b1f70b2197
Initial implementation
|
5 years ago |
Nick Sweeting
|
a645f36b87
add comment about fake cmd
|
5 years ago |
Cristian
|
66037535fd
feat: Add curl command on readability as default command to debug
|
5 years ago |
Cristian
|
bf3ea42141
fix: Add a default cmd value to handle case where the html cannot be retrieved
|
5 years ago |
Nick Sweeting
|
a2c158e43e
catch OSErrors due to missing path
|
5 years ago |
Nick Sweeting
|
7144e0bdce
search for node dependencies in output dir first
|
5 years ago |
Nick Sweeting
|
92de20af15
better detect missing dependencies on startup
|
5 years ago |
Cristian
|
05c71fc302
fix: Organize readability extractor so a timeout does not break the whole process
|
5 years ago |
Nick Sweeting
|
03b73bfe77
Update archivebox/extractors/readability.py
|
5 years ago |
Cristian
|
5dc7e63792
feat: Update dockerfile to support readability
|
5 years ago |
Cristian
|
2a68af1b94
tests: Add readability tests
|
5 years ago |
Cristian
|
8aa7b34de7
tests: Add readability to ignored methods in tests
|
5 years ago |
Cristian
|
dc87d8b68c
tests: Update failing tests
|
5 years ago |
Cristian
|
0ec747f64e
feat: Look in wget, singlefile or dom outputs before attempting to download the information again
|
5 years ago |
Cristian
|
a14762640e
feat: Avoid running readability when the target is a file
|
5 years ago |
Cristian
|
61e08a7c43
docs: Update docs link
|
5 years ago |
Cristian
|
b33c66a9f7
feat: Split output of readability into multiple files
|
5 years ago |