longzai
|
e4dc2701ef
fix URL_REGEX 2
|
1 year ago |
longzai
|
4ae765ec27
fix the URL_REGEX used in generic_html parsers
|
1 year ago |
spresse1
|
603ce7ec10
After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file.
|
2 years ago |
Ross Williams
|
c039ef05b3
Fix hyphen placement in util.URL_REGEX
|
2 years ago |
Ross Williams
|
d0e65eba7f
More reliably detect Google Chrome version number
|
2 years ago |
ふぁ
|
44a5a5ed7e
explicitly specify --headless=new
|
2 years ago |
ふぁ
|
d77c770c47
add CHROME_TIMEOUT args
|
2 years ago |
Nick Sweeting
|
606fa397a4
disable passing timeout arg to chrome because v111 is crashing when passed
|
2 years ago |
Nick Sweeting
|
1f1c70a8b1
remove --single-process from chrome args and add some rendering optimization args
|
2 years ago |
Nick Sweeting
|
49faec8f6d
add no-zygote and single-process args to try and prevent orphan chrome processes after exit
|
4 years ago |
Nick Sweeting
|
9f05cf8283
virtual-time-budget doesn't work with some chrome stuff
|
4 years ago |
Nick Sweeting
|
0c321a06d0
hide scrollbars in screenshots
|
4 years ago |
Nick Sweeting
|
a9986f1f05
add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support
|
4 years ago |
Nick Sweeting
|
5a9f27204a
don't use chrome when it's not available on Windows systems
|
4 years ago |
Nick Sweeting
|
3e26ae4a66
support finding multiple urls as substrings in text
|
4 years ago |
Nick Sweeting
|
c089501073
add response status code to headers.json
|
4 years ago |
Nick Sweeting
|
a0a79cead8
move utils and vendored libs into subfolders
|
5 years ago |
Nick Sweeting
|
104553489f
remove redundant utils file
|
5 years ago |
Nick Sweeting
|
83693a5c03
add packaging setup with stdeb for debian and apt
|
5 years ago |
Nick Sweeting
|
c47398851b
nicer timeout hints
|
5 years ago |
Cristian
|
62ed11a5ca
fix: Improve headers handling
|
5 years ago |
Angel Rey
|
f0915a56aa
Replaced get method
|
5 years ago |
Angel Rey
|
a8a8fd14ac
Fixed indent headers.json
|
5 years ago |
Angel Rey
|
852e3c9cff
Added headers extractor
|
5 years ago |
Cristian
|
b18bbf8874
test: Fix tests post-rebase
|
5 years ago |
apkallum
|
008769d296
add support for Paths in json encoder
|
5 years ago |
Nick Sweeting
|
3658153cf8
fix url parsing through quotes
|
5 years ago |
Cristian
|
d0d2991c69
fix: Change import that was not working
|
5 years ago |
Cristian
|
6006b4f93b
refactor: Organize code to remove flake8 issues
|
5 years ago |
Cristian
|
949f78aa65
fix: Use w3lib to improve the encoding extraction
|
5 years ago |