Nick Sweeting
|
17f40f3ada
Merge branch 'dev' into fix-URL_REGEX
|
1 year ago |
Nick Sweeting
|
c6f8a33a63
Update util.py
|
1 year ago |
longzai
|
e4dc2701ef
fix URL_REGEX 2
|
1 year ago |
longzai
|
4ae765ec27
fix the URL_REGEX used in generic_html parsers
|
1 year ago |
Nick Sweeting
|
c5bb99dce1
explicitly use Default profile inside user data dir
|
1 year ago |
Nick Sweeting
|
ca2c484a8e
Add `_EXTRA_ARGS` for various extractors (#1360)
|
1 year ago |
Ben Muthalaly
|
d8cf09c21e
Remove unnecessary variable length args for dedupe
|
1 year ago |
Ben Muthalaly
|
4686da91e6
Fix cookies being set incorrectly
|
1 year ago |
Ben Muthalaly
|
d74ddd42ae
Flip dedupe precedence order
|
1 year ago |
Ben Muthalaly
|
68326a60ee
Add cookies file to http request in `download_url`
|
1 year ago |
Ben Muthalaly
|
4d9c5a7b4b
Add `CHROME_EXTRA_ARGS`
|
1 year ago |
Ben Muthalaly
|
4e69d2c9e1
Add `EXTRA_*_ARGS` for wget, curl, and singlefile
|
1 year ago |
Nick Sweeting
|
6a4e568d1b
new archivebox update speed improvements
|
1 year ago |
Nick Sweeting
|
8c07b7e127
disable automatic chrome selfupdating
|
1 year ago |
Nick Sweeting
|
6184f659dc
improve window size chrome cli handling
|
1 year ago |
spresse1
|
603ce7ec10
After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file.
|
2 years ago |
Ross Williams
|
c039ef05b3
Fix hyphen placement in util.URL_REGEX
|
2 years ago |
Ross Williams
|
d0e65eba7f
More reliably detect Google Chrome version number
|
2 years ago |
ふぁ
|
44a5a5ed7e
add explicitly specify --headless=new
|
2 years ago |
ふぁ
|
d77c770c47
add CHROME_TIMEOUT args
|
2 years ago |
Nick Sweeting
|
606fa397a4
disable passing timeout arg to chrome because v111 is crashing when passed
|
2 years ago |
Nick Sweeting
|
1f1c70a8b1
remove --single-process from chrome args and add some rendering optimization args
|
2 years ago |
Nick Sweeting
|
49faec8f6d
add no-zygote and single-process args to try and prevent orphan chrome processes after exit
|
4 years ago |
Nick Sweeting
|
9f05cf8283
virtual-time-budget doesnt work with some chrome stuff
|
4 years ago |
Nick Sweeting
|
0c321a06d0
hide scrollbars in screenshots
|
4 years ago |
Nick Sweeting
|
a9986f1f05
add timezone support, tons of CSS and layout improvements, more detailed snapshot admin form info, ability to sort by recently updated, better grid view styling, better table layouts, better dark mode support
|
4 years ago |
Nick Sweeting
|
5a9f27204a
dont use chrome when its not available on windows systems
|
4 years ago |
Nick Sweeting
|
3e26ae4a66
support finding multiple urls as substrings in text
|
4 years ago |
Nick Sweeting
|
c089501073
add response status code to headers.json
|
4 years ago |
Nick Sweeting
|
a0a79cead8
move utils and vendored libs into subfolders
|
5 years ago |