Commit Graph

46 Commits

Author SHA1 Message Date
Michael Mintz d65c52cd18 Make multiple updates 2024-09-23 00:49:23 -04:00
Michael Mintz b0909b3bec Improve error-handling for "by"-selection 2024-08-06 23:15:38 -04:00
Michael Mintz 5af9e02955 Fix link-checking methods 2023-10-29 01:45:58 -04:00
Michael Mintz 44ca569bf7 Add support methods into "driver" instances 2023-09-09 23:20:10 -04:00
Michael Mintz f693fd13cc Drop support for Python 3.6 and Opera 2023-09-01 18:36:15 -04:00
Michael Mintz 7c3e34b3fe Refactor downloading 2023-08-18 18:50:01 -04:00
Michael Mintz afcc63ba7b Add "timeout" to "requests.get()" calls 2023-08-15 17:20:47 -04:00
Michael Mintz ea3788704c Refactor the code 2023-07-18 11:22:23 -04:00
Michael Mintz 27d2e44c3f Update and sort the list of allowable URL prefixes 2023-05-30 21:16:41 -04:00
Michael Mintz 5bc620942a Update the code that returns unique links 2023-03-09 14:07:11 -05:00
Michael Mintz e2502f8f7d Add more methods for saving and reading file data 2023-03-06 11:11:27 -05:00
Michael Mintz e6d35b478e Refactoring 2023-02-08 22:37:26 -05:00
Michael Mintz 5fd88b1fd8 Make improvements to multithreaded downloads 2023-02-08 22:34:43 -05:00
Michael Mintz 2516a8309b Refactoring 2023-02-03 01:10:36 -05:00
Michael Mintz f03c595fd1 Use requests.head() over requests.get() when possible 2022-04-22 02:59:09 -04:00
Michael Mintz 1e40f8359f Update a method that returns a unique selector 2022-03-15 19:52:12 -04:00
Michael Mintz 857f86014f Ignore certificate errors by default in get_link_status_code() 2022-02-11 15:37:45 -05:00
Michael Mintz 7936b1fc79 Code optimization and refactoring 2022-01-11 22:43:03 -05:00
Michael Mintz 82116af3c3 Add "&" as a shortcut for a single-syllable "name" selector 2021-08-23 23:03:39 -04:00
Michael Mintz 54d7804b10 Update methods for getting the link text from selectors 2021-08-23 23:01:35 -04:00
Michael Mintz db91e6c2fa Add additional shortcuts for partial_link_text selectors 2021-08-23 22:59:52 -04:00
Michael Mintz 093c1063cd Optimize code with syntax refactoring 2021-05-03 22:38:13 -04:00
Michael Mintz a203c77d66 Fix a method that finds all unique links on a page 2021-03-04 16:43:57 -05:00
Michael Mintz f26bdeac39 Improve error-handling 2020-08-14 20:30:55 -04:00
Michael Mintz c8059bf774 Fix UnicodeDecodeError issues on Windows 2020-07-19 14:18:02 -04:00
Michael Mintz 2a35c266e4 Update URL detection 2020-05-25 04:22:58 -04:00
Michael Mintz 9d76cc9c8c Fix assert_no_404_errors() method 2019-10-13 09:04:54 -04:00
Michael Mintz e0ed83a07c Optimize selector detection and usage 2019-07-21 14:10:41 -04:00
Michael Mintz c1adb67742 Fix link formation 2019-04-22 03:38:54 -04:00
Michael Mintz 8ffd3a256c Ignore parsing urls that don't have :// in them (ex: about:blank) 2019-04-01 11:15:26 -04:00
Michael Mintz 420e845d94 Add methods for getting status codes from links 2019-04-01 02:06:31 -04:00
Michael Mintz 4d52f879f8 flake8 fixes 2018-11-05 00:23:53 -05:00
Michael Mintz cb06267c50 Refactor js_utils from page_utils 2018-11-02 19:06:45 -04:00
Michael Mintz 64cb4a50de Major refactoring to organize methods 2018-10-10 03:35:11 -04:00
Michael Mintz f55e55dcff Quotes need to be properly escaped before Javascript calls 2018-08-30 01:50:24 -04:00
Michael Mintz 2e29316c6d Add save_data_as() method 2018-08-19 15:05:46 -04:00
Michael Mintz 06f9286f02 Use built-in Python methods when possible 2018-04-02 21:18:36 -04:00
Michael Mintz 629ebd97e6 Add some special URLs as valid URLs 2018-02-26 18:22:15 -05:00
Michael Mintz 84ca2d376e Add new way of extracting link text from a selector 2018-02-25 20:34:43 -05:00
Michael Mintz d687fcc96b Add a method to determine if a URL is valid 2018-02-13 04:48:05 -05:00
Michael Mintz e53f044911 Python 3 compatibility (special characters) 2017-07-19 18:46:55 -04:00
Michael Mintz 5231a7e23e Add a method to determine if a selector is an xpath selector 2017-04-08 17:45:43 -04:00
Michael Mintz c843ef3952 Add _download_file_to() to page_utils.py 2016-06-18 21:27:52 -04:00
Michael Mintz 83a34364b3 Update comments 2016-05-30 16:57:46 -04:00
Michael Mintz 94eecbd714 Add a method to extract the domain url from a full url 2016-05-20 23:52:53 -04:00
Michael Mintz 11067cf1b1 Fresh Copy 2015-12-04 16:11:53 -05:00