### Changelog

All notable changes to this project will be documented in this file. Dates are displayed in UTC.

#### [v1.0.8](https://github.com/janreges/siteone-crawler/compare/v1.0.7...v1.0.8)

- reports: changed file name composition from report.mydomain.com.* to mydomain.com.report.* [`#9`](https://github.com/janreges/siteone-crawler/pull/9)
- crawler: solved edge-case, which very rarely occurred when the queue processing was already finished, but the last outstanding coroutine still found some new URL [`a85990d`](https://github.com/janreges/siteone-crawler/commit/a85990d662d74af281805cfdf10c0320fee0007a)
- javascript processor: improvement of webpack JS processing in order to correctly replace paths from VueJS during offline export (as e.g. in case of docs.netlify.com) .. without this, HTML had the correct paths in the left menu, but JS immediately broke them because they started with an absolute path with a slash at the beginning [`9bea99b`](https://github.com/janreges/siteone-crawler/commit/9bea99b9684e6059b8abfad4b382fafdad31c9a9)
- offline export: detect and process fonts.googleapis.com/css* as CSS even if there is no .css extension [`da33100`](https://github.com/janreges/siteone-crawler/commit/da33100975635be8305e07c2023a22c300b66216)
- js processor: removed the forgotten var_dump [`5f2c36d`](https://github.com/janreges/siteone-crawler/commit/5f2c36de1666e6987d2c9d88a39e3b6d0a2e1f32)
- offline export: improved search for external JS in the case of webpack (dynamic composition of URLs from an object with the definition of chunks) - it was debugged on docs.netlify.com [`a61e72e`](https://github.com/janreges/siteone-crawler/commit/a61e72e7f5b773a437b4151432db04a5afd7124a)
- offline export: in case the URL ends with a dot and a number (so it looks like an extension), we must not recognize it as an extension in some cases [`c382d95`](https://github.com/janreges/siteone-crawler/commit/c382d959f7440ebfcd95566ec0050e771a2f3495)
- offline url converter: better support for SVG in case the URL does not contain an extension at all, but has e.g. 'icon' in the URL (it's not perfect) [`c9c01a6`](https://github.com/janreges/siteone-crawler/commit/c9c01a69905fefce82f4e8f85e707a0d1abb5e1e)
- offline exporter: warning instead of exception for some edge-cases, e.g. not saving SVG without an extension does not cause the export to stop [`9d285f4`](https://github.com/janreges/siteone-crawler/commit/9d285f4d599ba8892dd8752e8d831cd3c86af178)
- cors: do not set Origin request header for images (otherwise error 403 on cdn.sanity.io for svg, etc.) [`2f3b7eb`](https://github.com/janreges/siteone-crawler/commit/2f3b7eb51a03d42d3d2961c84aadcd118b546e05)
- best practice analyzer: when checking for missing quotes, ignore values longer than 1000 characters (fixes, e.g., the error 'Compilation failed: regular expression is too large at offset 90936' on skoda-auto.cz) [`8a009df`](https://github.com/janreges/siteone-crawler/commit/8a009df9734773275fd9805862dc9bfeeccb6079)
- html report: added loading of extra headers to the visited URL list in the HTML report [`781cf17`](https://github.com/janreges/siteone-crawler/commit/781cf17c18088126db74ebc1ef00fee3d6784979)
- Frontload the report names [`62d2aae`](https://github.com/janreges/siteone-crawler/commit/62d2aae57e31c7bfa53720446cc8dfbc59e482af)
- robots.txt: added option --ignore-robots-txt (we often need to view internal or preview domains that are otherwise prohibited from indexing by search engines) [`9017c45`](https://github.com/janreges/siteone-crawler/commit/9017c45a675dd327895b57f14095ad6bd52a02fc)
- http client: added an explicit 'Connection: close' header and an explicit call to $client-&gt;close(), even though Swoole was doing it automatically after exiting the coroutine [`86a7346`](https://github.com/janreges/siteone-crawler/commit/86a7346d059452d210b945ca4329e1cc17781dca)
- javascript processor: parse url addresses to import the JS module only in JS files (otherwise imports from HTML documentation, e.g. on the websites svelte.dev or nextjs.org, were parsed by mistake) [`592b618`](https://github.com/janreges/siteone-crawler/commit/592b618c01e75509e16a812fafab7f21f3c7c64d)
- html processor: added obtaining urls from HTML attributes that are not wrapped in quotes (but I am aware that the current regexps can cause problems when spaces are used that are not properly escaped) [`f00abab`](https://github.com/janreges/siteone-crawler/commit/f00ababfa459eca27dce7657fe91c70831f86089)
- offline url converter: swapping woff2/woff order for regex because in this case their priority is important and because of that woff2 didn't work properly [`3f318d1`](https://github.com/janreges/siteone-crawler/commit/3f318d19fa0a3757546493ac7f47cca21922b1f5)
- non-200 url basename detection: we no longer consider e.g. image generators that have the same basename and the url to the image in the query parameters as the same basename [`bc15ef1`](https://github.com/janreges/siteone-crawler/commit/bc15ef198bb13fe845fef8cd4946b2cab5c2ea6d)
- supertable: activation of automatic creation of active links also for homepage '/' [`c2e228e`](https://github.com/janreges/siteone-crawler/commit/c2e228e0d475351431cf9b060487e86ce6d33e52)
- analysis and robots.txt: improving the display of url addresses for SEO analysis in the case of a multi-domain website, so that it cannot happen that the same url, e.g. '/', is in the overview multiple times without recognizing the domain or scheme + improving the work with robots.txt in SEO detection and displaying urls banned for indexing [`47c7602`](https://github.com/janreges/siteone-crawler/commit/47c7602217e40a4f6d4f3af5c71d6dff72952aab)
- offline website exporter: we add the suffix '_' to the folder name only in the case of a typical extension of a static file - we don't want this to happen with domain names as well [`d16722a`](https://github.com/janreges/siteone-crawler/commit/d16722a5ad6271270fb0fff11e66a7f02f3b6e9a)
- javascript processor: extract JS urls also from imports like import {xy} from "./path/foo.js" [`aec6cab`](https://github.com/janreges/siteone-crawler/commit/aec6cab051a46df9d89866f5cfd7e66312dafb92)
- visited url: added 'txt' extension to looksLikeStaticFileByUrl() [`460c645`](https://github.com/janreges/siteone-crawler/commit/460c6453d91e85c2889ebaa2b2542fd88c5ffa6a)
- html processor: extract JS urls also from &lt;link href="*.js"&gt;, typically with rel="modulepreload" [`c4a92be`](https://github.com/janreges/siteone-crawler/commit/c4a92bee00d96c530431134370a3ba0d2216a1c1)
- html processor: extracting repeated calls to getFullUrl() into a variable [`a5e1306`](https://github.com/janreges/siteone-crawler/commit/a5e1306530717d9edd4f95a7989539a172a38f4a)
- analysis: do not include urls that failed to load (timeout, skipping, etc.) in the analysis of content-types and source-domains - prevention of displaying content type 'unknown' [`b21ecfb`](https://github.com/janreges/siteone-crawler/commit/b21ecfb85f58d07c0a82b93826ad2977ab2cd523)
- cli options: improved method of removing quotes even for options that can be arrays - also fixes --extra-columns='Title' [`97f2761`](https://github.com/janreges/siteone-crawler/commit/97f27611acf2fc4ed24b1e5574be84711ea3fa12)
- url skipping: if there are a lot of URLs with the same basename (ending after the last slash), we will allow a maximum of 5 requests for URLs with the same basename - the purpose is to prevent a lot of 404 from being triggered when there is an incorrect relative link to relative/my-img.jpg on all pages (e.g. on 404 page on v2.svelte.dev) [`4fbb917`](https://github.com/janreges/siteone-crawler/commit/4fbb91791f9111cc6f9d98b60732fcca7fad2f1f)
- analysis: perform most of the analysis only on URLs from domains for which we have crawling enabled [`313adde`](https://github.com/janreges/siteone-crawler/commit/313addede29ac847273b6ab6ed3a8ab878a6fb4a)
- audio & video: added audio/video file search in &lt;audio&gt; and &lt;video&gt; tags, if file crawling is not disabled [`d72a5a5`](https://github.com/janreges/siteone-crawler/commit/d72a5a51bd6863425a3d8bcffc7a9b5eb831f979)
- best practices: reworded a stupid warning like '&lt;h2&gt; after &lt;h0&gt;' to '&lt;h2&gt; without previous heading' [`041b383`](https://github.com/janreges/siteone-crawler/commit/041b3836a8a585158ae1a1a6fb0057b367f3a4f6)
- initial url redirect: if the entered url redirects to another url/domain within the same 2nd-level domain (typically http-&gt;https or mydomain.tld -&gt; www.mydomain.tld redirects), we continue crawling with the new url/domain and declare the new url as the initial url [`166e617`](https://github.com/janreges/siteone-crawler/commit/166e617fbc893798dc7b340f43de75df2d4cf335)
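
The basename limit from the 'url skipping' entry above can be pictured with a minimal Python sketch (the crawler itself is written in PHP). `BasenameLimiter` and `should_request` are hypothetical names, and the sketch assumes that every request counts toward the limit of 5; the real implementation may count only failed (non-200) responses.

```python
from collections import Counter
from urllib.parse import urlsplit

MAX_REQUESTS_PER_BASENAME = 5  # the limit named in the changelog entry


class BasenameLimiter:
    """Hypothetical helper: allows at most `limit` requests per URL basename."""

    def __init__(self, limit=MAX_REQUESTS_PER_BASENAME):
        self.limit = limit
        self.seen = Counter()

    def should_request(self, url):
        # The basename is the part of the path after the last slash.
        basename = urlsplit(url).path.rsplit("/", 1)[-1]
        if not basename:
            # Assumption: directory-like URLs (ending with '/') are never limited.
            return True
        self.seen[basename] += 1
        return self.seen[basename] <= self.limit
```

With a broken relative link such as `relative/my-img.jpg` repeated on every page, only the first five distinct URLs ending in `my-img.jpg` would be requested in this sketch.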

#### [v1.0.7](https://github.com/janreges/siteone-crawler/compare/v1.0.6...v1.0.7)

> 22 December 2023

- version 1.0.7.20231222 + changelog [`9d2be52`](https://github.com/janreges/siteone-crawler/commit/9d2be52776c081989322953c7a31debfd4947420)
- html report template: updated logo link to crawler.siteone.io [`9892cfe`](https://github.com/janreges/siteone-crawler/commit/9892cfe5708a3da2f5fc355246dd50b2a0c5cb4f)
- http headers analysis: renamed 'Headers' to 'HTTP headers' [`436e6ea`](https://github.com/janreges/siteone-crawler/commit/436e6ea5a9914c8615bb03b444ac0aad15e31c49)
- sitemap generator: added info about crawler to generated sitemap.xml [`7cb7005`](https://github.com/janreges/siteone-crawler/commit/7cb7005bf50b8f93b421c94c57ff51eb99b45912)
- html report: refactor of all inline on* event listeners to data attributes and event listeners added from static JS inside &lt;script&gt;, so that we can disable all inline JS in the online HTML report and allow only our JS signed with hashes by Content-Security-Policy [`b576eef`](https://github.com/janreges/siteone-crawler/commit/b576eef55a5678a67928970fc51aaaefd7abd1a8)
- readme: removed HTTP auth from roadmap (it's already done), improved guide how to implement own upload endpoint and message about SMTP moved under mailer options [`e1567ae`](https://github.com/janreges/siteone-crawler/commit/e1567aee52f9d09c1cef1ad35babaf9eea388175)
- utils: hide passwords/authentication specified in cli parameters as *auth=xyz (e.g. --http-auth=abc:xyz) in html report [`c8bb88f`](https://github.com/janreges/siteone-crawler/commit/c8bb88fc1a65ecdfd53db23fc5d972b841830837)
- readme: fixed formatting of the upload and expert options [`2d14bd5`](https://github.com/janreges/siteone-crawler/commit/2d14bd5972496989624f91617de2689601e1c027)
- readme: added Upload Options [`d8352c5`](https://github.com/janreges/siteone-crawler/commit/d8352c5acfddbeef1c1ae6498556dc296d944e0b)
- upload exporter: added possibility via --upload to upload HTML report to online URL, by default crawler.siteone.io/html/* [`2a027c3`](https://github.com/janreges/siteone-crawler/commit/2a027c38bfdb8e6e416b9a79ebe81e809c9326d9)
- parsed-url: fixed warning in the case of url without host [`284e844`](https://github.com/janreges/siteone-crawler/commit/284e844f3f94cdb02032ddb76e51caa9a584c120)
- seo and opengraph: fixed false positives 'DENY (robots.txt)' in some cases [`658b649`](https://github.com/janreges/siteone-crawler/commit/658b6494130fa282505ec38f12aa058acf7709b9)
- best practices and inline-svgs: detection and display of the entire icon set in the HTML report in the case of &lt;svg&gt; with more &lt;symbol&gt; or &lt;g&gt; [`3b2772c`](https://github.com/janreges/siteone-crawler/commit/3b2772c59f822b7b4a6f91e15b616815b5ff92c4)
- sitemap generator: sort urls primarily by number of dashes and secondarily alphabetically (thanks to this, urls of the main levels will be at the beginning) [`bbc47e6`](https://github.com/janreges/siteone-crawler/commit/bbc47e6239f9693c621016a50e624698dc3d242d)
- sitemap generator: only include URLs from the same domain as the initial URL [`9969254`](https://github.com/janreges/siteone-crawler/commit/9969254e35cd8c134f85a7817de8722091f0377c)
- changelog: updated by 'composer changelog' [`0c67fd4`](https://github.com/janreges/siteone-crawler/commit/0c67fd4f8d308d8d51d5b912d9b82cc96fb6e4fb)
- package.json: used by auto-changelog generator [`6ad8789`](https://github.com/janreges/siteone-crawler/commit/6ad87895e5a8ab8bbce3d9cbf92ee5e8b8218cc0)
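
The masking of `*auth=xyz` parameters described in the 'utils' entry above could look roughly like the following Python sketch. The function name `mask_auth_options`, the regular expression and the `***` mask are assumptions made for illustration; the crawler's own PHP implementation may differ.

```python
import re

# Matches any long option whose name ends in "auth", e.g. --http-auth=abc:xyz,
# with a value that is double-quoted, single-quoted or unquoted.
_AUTH_OPTION = re.compile(r"""(--[\w-]*auth=)("[^"]*"|'[^']*'|\S+)""")


def mask_auth_options(command, mask="***"):
    """Replace the value of every *auth=... option with a mask."""
    return _AUTH_OPTION.sub(lambda m: m.group(1) + mask, command)
```

For example, `mask_auth_options("./crawler --http-auth=abc:xyz")` keeps the option name and replaces only the credentials.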

#### [v1.0.6](https://github.com/janreges/siteone-crawler/compare/v1.0.5...v1.0.6)

> 8 December 2023

- readme: removed bold links from the intro (it didn't look as good on github as it did in the IDE) [`b675873`](https://github.com/janreges/siteone-crawler/commit/b6758733cde67f11322a2f82573b19ec1a0edc9d)
- readme: improved intro and gif animation with the real output [`fd9e2d6`](https://github.com/janreges/siteone-crawler/commit/fd9e2d69c8f940cfaa81ad7bab86f1a74f01b0da)
- http auth: for security reasons, we only send auth data to the same 2nd level domain (and possibly subdomains). With HTTP basic auth, the name and password are only base64 encoded and we would send them to foreign domains (which are referred to from the crawled website) [`4bc8a7f`](https://github.com/janreges/siteone-crawler/commit/4bc8a7f9871064aa1c88c374aa299904409d2817)
- html report: increased specificity of the .header class for the header, because this class was also used by the generic class at &lt;td class='header'&gt; in security tab [`9d270e8`](https://github.com/janreges/siteone-crawler/commit/9d270e884545d6459f20348db71404e513ae8928)
- html report: improved readability of badge colors in light mode [`76c5680`](https://github.com/janreges/siteone-crawler/commit/76c5680397446b84f3b13800590d914b7a9b0533)
- crawler: moving the decrement of active workers after parsing URLs from the content, where further filling of the queue could occur (for this reason, queue processing could sometimes get stuck in the final stages) [`f8f82ab`](https://github.com/janreges/siteone-crawler/commit/f8f82ab61c1969952bb70f1b598ed3d97938a84e)
- analysis: do not parse/check empty HTML (it produced an unnecessary warning) - it is valid to have content-type: text/html but with content-length: 0 (for example for 'gtm.js?id=') [`436d81b`](https://github.com/janreges/siteone-crawler/commit/436d81b81f905178fb972f8b5cd0236bac244bc4)
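
The 'http auth' rule above (credentials go only to the same 2nd-level domain and its subdomains) can be sketched in Python as follows. Both function names are hypothetical, and the sketch naively treats the last two host labels as the 2nd-level domain, so it does not handle public suffixes such as `co.uk`.

```python
from urllib.parse import urlsplit


def second_level_domain(host):
    """Return the last two labels of a host name (naive, no public suffix list)."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def may_send_auth(initial_url, request_url):
    """Allow HTTP auth only when both URLs share the same 2nd-level domain."""
    initial_host = urlsplit(initial_url).hostname or ""
    request_host = urlsplit(request_url).hostname or ""
    if not initial_host or not request_host:
        return False
    return second_level_domain(initial_host) == second_level_domain(request_host)
```

Because Basic auth credentials are only base64 encoded, a check of this kind keeps them from reaching foreign domains that the crawled website merely links to.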

#### [v1.0.5](https://github.com/janreges/siteone-crawler/compare/v1.0.4...v1.0.5)

> 3 December 2023

- changelog: updated changelog after 3 added commits to still untagged draft release 1.0.5 [`f42fe18`](https://github.com/janreges/siteone-crawler/commit/f42fe18de89676dc0dea4dc033207c934282d04b)
- utils tests: fixed tests of methods getAbsolutePath() and getOutputFormattedPath() [`d4f4576`](https://github.com/janreges/siteone-crawler/commit/d4f4576ff566eb48495c9fb55a898b0989ef42c3)
- crawler.php: replaced preg_match with str_contains [`5b28952`](https://github.com/janreges/siteone-crawler/commit/5b289521cdbb90b6571a29cb9c880e065b852129)
- version: 1.0.5.20231204 + changelog [`7f2e974`](https://github.com/janreges/siteone-crawler/commit/7f2e9741fab25e9369151bc2d79a38b8827e2463)
- option: replace placeholders like '%domain' also in the validateValue() method, because it also checks whether the path is writable by attempting mkdir [`329143f`](https://github.com/janreges/siteone-crawler/commit/329143fa23925ea523504735b3f724c026fe5ac6)
- swoole in cygwin: improved getBaseDir() to work better even with the version of Swoole that does not have SCRIPT_DIR [`94cc5af`](https://github.com/janreges/siteone-crawler/commit/94cc5af4411a8c7427ee136a937ac629b8637668)
- html processor: it must also process pages with a redirect, because the URL in the meta redirect tag needs to be replaced [`9ce0eee`](https://github.com/janreges/siteone-crawler/commit/9ce0eeeebe1e524b9d46d91dd4cecb2e796db8c3)
- sitemap: use formatted output path (primarily for better output in Cygwin environment with needed C:/foo &lt;-&gt; /cygdrive/c/foo conversion) [`6297a7f`](https://github.com/janreges/siteone-crawler/commit/6297a7f4069f9e09c013268e0df896db2fa91dec)
- file exporter: use formatted output path (primarily for better output in Cygwin environment with needed C:/foo &lt;-&gt; /cygdrive/c/foo conversion) [`426cfb2`](https://github.com/janreges/siteone-crawler/commit/426cfb2b32f854d65abfce841e4e4f4badf04fef)
- options: in the case of dir/file validation, we want to work with absolute paths for more precise error messages [`6df228b`](https://github.com/janreges/siteone-crawler/commit/6df228bdfc87a2c9fb6eee611fdc87d976b7f721)
- crawler.php: improved baseDir detection - we want to work with absolute path in all scenarios [`9d1b2ce`](https://github.com/janreges/siteone-crawler/commit/9d1b2ce9bedb15ede90bcee9641e1cfc62b9c3cc)
- utils: improved getAbsolutePath() for cygwin and added getOutputFormattedPath() with reverse logic for cygwin (C:/foo/bar &lt;-&gt; /cygdrive/c/foo/bar) [`161cfc5`](https://github.com/janreges/siteone-crawler/commit/161cfc5c4fd3fa3675cade409d7d5e11db2da0c6)
- offline export: renamed --offline-export-directory to --offline-export-dir for consistency with --http-cache-dir or --result-storage-dir [`26ef45d`](https://github.com/janreges/siteone-crawler/commit/26ef45d145a1a02a5313067e6298571e26d9618b)
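
The Cygwin path conversion mentioned above (C:/foo/bar &lt;-&gt; /cygdrive/c/foo/bar) can be illustrated with a small Python sketch. The functions `to_cygwin_path` and `to_windows_path` are hypothetical stand-ins, not the crawler's getAbsolutePath() or getOutputFormattedPath(), and they only cover the simple drive-letter case.

```python
import re


def to_cygwin_path(path):
    """Convert C:/foo/bar (or C:\\foo\\bar) to /cygdrive/c/foo/bar."""
    match = re.match(r"^([A-Za-z]):[\\/](.*)$", path)
    if not match:
        return path  # not a Windows drive path, leave untouched
    drive = match.group(1).lower()
    rest = match.group(2).replace("\\", "/")
    return "/cygdrive/%s/%s" % (drive, rest)


def to_windows_path(path):
    """Convert /cygdrive/c/foo/bar back to C:/foo/bar."""
    match = re.match(r"^/cygdrive/([A-Za-z])/(.*)$", path)
    if not match:
        return path  # not a Cygwin drive path, leave untouched
    return "%s:/%s" % (match.group(1).upper(), match.group(2))
```

The first direction is what file operations inside Cygwin need; the reverse direction gives paths that are readable for a Windows user in the printed output.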

#### [v1.0.4](https://github.com/janreges/siteone-crawler/compare/v1.0.3...v1.0.4)

> 30 November 2023

- dom parsing: handling warnings in case of impossibility to parse some DOM elements correctly, fixes #3 [`#3`](https://github.com/janreges/siteone-crawler/issues/3)
- version: 1.0.4.20231201 + changelog [`8e15781`](https://github.com/janreges/siteone-crawler/commit/8e15781265cdd9cce10d9dcde57d46b57b50e1cf)
- options: ignore empty values in the case of directives with the possibility of repeated definition [`5e30c2f`](https://github.com/janreges/siteone-crawler/commit/5e30c2f8ad6cf00ad819ba1d7d6ec4e6c95a7113)
- http-cache: now the http cache is turned off using the 'off' value (it's more understandable) [`9508409`](https://github.com/janreges/siteone-crawler/commit/9508409fbba2d96dc92cd73bed5abe462d5cea15)
- core options: added --console-width to enforce the definition of the console width and disable automatic detection via 'tput cols' on macOS/Linux or 'mode con' on Windows (used by Electron GUI) [`8cf44b0`](https://github.com/janreges/siteone-crawler/commit/8cf44b06616e15301c486146a7c6b1003ce5137f)
- gui support: added base-dir detection for Windows where the GUI crawler runs in Cygwin [`5ce893a`](https://github.com/janreges/siteone-crawler/commit/5ce893a66c7f1e21af025603b66223e04246e029)
- renaming: renamed 'siteone-website-crawler' to 'siteone-crawler' and 'SiteOne Website Crawler' to 'SiteOne Crawler' [`64ddde4`](https://github.com/janreges/siteone-crawler/commit/64ddde4b53f16679a8c4671c98b3f9c619d94b42)
- utils: fixed color-support detection [`62dbac0`](https://github.com/janreges/siteone-crawler/commit/62dbac07d15ecfa0ff677c277e2a3381a47025bf)
- core options: added --force-color option to bypass tty detection (used by Electron GUI) [`607b4ad`](https://github.com/janreges/siteone-crawler/commit/607b4ad8583845adea209f75edfa27870ac23f9d)
- best practice analysis: in the case of checking an image (e.g. for the existence of WebP/AVIF), we also want to check external images, because very often websites have images linked from external domains or services for image modification or optimization [`6100187`](https://github.com/janreges/siteone-crawler/commit/6100187347e0bbba6270335e2d9b2faf37475333)
- html report: set scaleDown as default object-fit for image gallery [`91cd300`](https://github.com/janreges/siteone-crawler/commit/91cd300dcd7455c2b9be548fb2746cea7fd7c904)
- offline exporter: added short -oed as alias to --offline-export-directory [`22368d9`](https://github.com/janreges/siteone-crawler/commit/22368d9a892aab8011aa4a0884bf01a8560f6167)
- image gallery: list of all images on the website (except those from the srcset, where there would be duplicates only in other sizes or formats), including SVG with rich filtering options (through image format, size and source tag/attribute) and the option of choosing small/medium/view and scale-down/contain/cover for object-fit css property [`43de0af`](https://github.com/janreges/siteone-crawler/commit/43de0af1c60d398f91b373c192d1a35ac2df2fd1)
- core options: added a shortened version of the command name consisting of only one hyphen and the first letters of the words of the full command (e.g. --memory-limit has short version -ml), added getInitialScheme() [`eb9a3cc`](https://github.com/janreges/siteone-crawler/commit/eb9a3cc62dffc58be2701c52bb21509d39a5dfad)
- visited url: added 'sourceAttr' with information about where the given URL was found and useful helper methods [`6de4e39`](https://github.com/janreges/siteone-crawler/commit/6de4e39c5f8b9ba685e3865193274ccf0ee91a3d)
- found urls: in the case of the occurrence of one URL in several places/attributes, we consider the first one to be the main one (typically the same URL in src and then also in srcset) [`660bb2b`](https://github.com/janreges/siteone-crawler/commit/660bb2b2bd2cb6949fe9c573e72b31e9fb97a9fe)
- url parsing: added more recognition of which attributes the given URL address was parsed from (we need to recognize src and srcset for ImageGallery in particular) [`802c3c6`](https://github.com/janreges/siteone-crawler/commit/802c3c66a40087745e68f47392f0e6e8e9725171)
- supertable and urls: when removing the redundant hostname for a more compact URL output, we also take into account the scheme http:// or https:// of the initial URL (otherwise it looked like a duplicate in some places) + prevention of ansi-color definitions for bash in the HTML output [`915469e`](https://github.com/janreges/siteone-crawler/commit/915469e2a4a6d0fed337ca70efe9170758751ade)
- title/description/keywords parsing: added html entities decoding because some websites use encoded entities such as `&#xED;` or `&#x2013;` [`920523d`](https://github.com/janreges/siteone-crawler/commit/920523d3c55baf6cd7b2602334d9776b3e40f4d7)
- crawler: added 'sourceAttr' to the swoole table queue and already visited URLs (we will use it in the Image Gallery for filtering, so as not to display unnecessarily and a lot of duplicate images only in other resolutions from the srcsets) [`0345abc`](https://github.com/janreges/siteone-crawler/commit/0345abc6dab770e3196dd88ff0123a2050828644)
- url parameter: it is now possible to omit the scheme, and https:// or http:// will be added automatically (http:// e.g. for localhost) [`85e14e9`](https://github.com/janreges/siteone-crawler/commit/85e14e961b53b83c208ac936972a335cace61bf8)
- disabled images: in the case of a request to remove the images, replace their body with a 1x1px transparent gif and place a semi-transparent hatch with the crawler logo and opacity as a background [`c1418c3`](https://github.com/janreges/siteone-crawler/commit/c1418c3154301fd3995dde421b066f16850203e7)
- url regex filtering: added an option that allows you to limit the list of crawled pages according to the declared regexps, while still allowing the crawler to crawl and download assets (js, css, images, fonts, documents, etc.) from any URL (but with respect to allowed domains) [`21e67e5`](https://github.com/janreges/siteone-crawler/commit/21e67e5be74050cd5b7c9998654ed66f18db4d85)
- img srcset parsing: because a valid URL can also contain a comma (and various dynamic parametric img generators use them) and in the srcset a comma+whitespace should be used to separate multiple values, this is also reflected in the srcset parsing [`0db578b`](https://github.com/janreges/siteone-crawler/commit/0db578bda37c024b2b111c814e35c2107e4751ad)
- websocket server: added option to set --websocket-server, which starts a parallel process with the websocket server, through which the crawler sends various information about the progress of crawling (this will also be used by Electron UI applications) [`649132f`](https://github.com/janreges/siteone-crawler/commit/649132f8965421cd1bb3570fbb9f534e6caef313)
- http client: handle scenario when content loaded from cache is not valid (is_bool) [`1ddd099`](https://github.com/janreges/siteone-crawler/commit/1ddd099ecdadc5752016237ec1f0acf80e907dc8)
- HTML report: updated logo with final look [`2a3bb42`](https://github.com/janreges/siteone-crawler/commit/2a3bb428180067a649f2467419920b3d4f70a9fd)
- mailer: shortening and simplifying email content [`e797107`](https://github.com/janreges/siteone-crawler/commit/e7971071f8c5e4cff1472464ce9ec4407c198a59)
- robots.txt: added info about loaded robots.txt to summary (limited to 10 domains for case of huge multi domain crawling) [`00f9365`](https://github.com/janreges/siteone-crawler/commit/00f93659637705bc6389c5f073a29f09b743370f)
- redirects analyzer: handled edge case with empty url [`e9be1e3`](https://github.com/janreges/siteone-crawler/commit/e9be1e350b1d114c54b7099b54277da23467b538)
- text output: added fancy banner with crawler logo (thanks to great SiteOne designers!) and smooth effect [`e011c35`](https://github.com/janreges/siteone-crawler/commit/e011c35f3cbc87fceb9d7a9c56c726817c79b543)
- content processors: added applyContentChangesBeforeUrlParsing() and better NextJS chunks handling [`e5c404f`](https://github.com/janreges/siteone-crawler/commit/e5c404f2d52a7c2ebdb80ae3c93760c7e881dc9a)
- url searches: added ignoring data:, mailto:, tel:, file:// and other non-requestable resources also to FoundUrls [`5349be2`](https://github.com/janreges/siteone-crawler/commit/5349be242f99567b8f5f093537a696ef5fd319ac)
- crawler: added declare(strict_types=1) and banner [`27134d2`](https://github.com/janreges/siteone-crawler/commit/27134d29d16e3e24c633f010f731f11deeeadcb7)
- heading structure analysis: highlighting and calculating errors for duplicate &lt;h1&gt; + added help cursor with a hint [`f5c7db6`](https://github.com/janreges/siteone-crawler/commit/f5c7db6206ed06e0cbaf38a7ae2505be573da2e6)
- core options: added --help and --version, colorized help [`6f1ada1`](https://github.com/janreges/siteone-crawler/commit/6f1ada112898580d2de028c02e32fdeb8ad2a845)
- ./crawler binary - send output of cd - to /dev/null and hide unwanted printed script path [`16fe79d`](https://github.com/janreges/siteone-crawler/commit/16fe79d08e24c4a6fbd87d16417413725aaa24e8)
- README: updated paths in the documentation - it is now possible to use the ERROR: Option --url () must be valid URL [`86abd99`](https://github.com/janreges/siteone-crawler/commit/86abd998da94971c2512b6018085f39e8dd5db7f)
- options: --workers default for Cygwin runtime is now 1 (instead of 3), because Cygwin runtime is highly unstable when workers &gt; 1 [`f484960`](https://github.com/janreges/siteone-crawler/commit/f4849606fb382e1b759f547c4f1bfe2e5d8b4d02)
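
The 'img srcset parsing' entry above relies on the fact that srcset candidates are separated by a comma followed by whitespace, while a comma without whitespace may be part of a URL. A minimal Python sketch of that idea follows; `parse_srcset` is a hypothetical name and the example URLs are invented.

```python
import re
from typing import List


def parse_srcset(srcset: str) -> List[str]:
    """Return the URLs of a srcset value, keeping commas that belong to URLs."""
    urls = []
    # Split only on "comma + whitespace"; a bare comma stays inside the URL.
    for candidate in re.split(r",\s+", srcset.strip()):
        parts = candidate.split()
        if parts:
            # The first token is the URL, the optional second one is the
            # descriptor (e.g. "2x" or "640w"). Drop a trailing separator comma.
            urls.append(parts[0].rstrip(","))
    return urls
```

A value like `/img/w_200,h_100/a.jpg 1x, /img/w_400,h_200/a.jpg 2x` therefore yields two URLs, each with its internal comma intact.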

#### [v1.0.3](https://github.com/janreges/siteone-crawler/compare/v1.0.2...v1.0.3)

> 10 November 2023

- version: 1.0.3.20231110 + changelog [`5b80965`](https://github.com/janreges/siteone-crawler/commit/5b8096550dcd489a998d34fae44e3d99375e33e3)
- cache/storage: better race-condition handling in a situation where several coroutines could write the same folder at one time, then mkdir reported 'File exists' [`be543dc`](https://github.com/janreges/siteone-crawler/commit/be543dc195e675e49064b20ee091903f1977942a)
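
The race condition described above (several coroutines creating the same folder, with mkdir reporting 'File exists') has a common remedy: treat "already exists" as success as long as the path really is a directory. The Python sketch below shows that pattern under the assumption that this is roughly what the fix does; `ensure_dir` is a hypothetical name.

```python
import os


def ensure_dir(path):
    """Create a directory, tolerating concurrent creation by another worker."""
    if os.path.isdir(path):
        return
    try:
        os.makedirs(path)
    except FileExistsError:
        # Another worker created the path between the check and makedirs().
        # That is fine unless something other than a directory is in the way.
        if not os.path.isdir(path):
            raise
```

In Python the same effect is available directly through `os.makedirs(path, exist_ok=True)`; the explicit version is shown only to make the check-then-create race visible.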

#### [v1.0.2](https://github.com/janreges/siteone-crawler/compare/v1.0.1...v1.0.2)

> 10 November 2023

- version: 1.0.2.20231110 + changelog [`230b947`](https://github.com/janreges/siteone-crawler/commit/230b9478a36ee664dfe080447c09da9c4a9bc25c)
- html report: added aria labels to active/important elements [`a329b9d`](https://github.com/janreges/siteone-crawler/commit/a329b9d4e0f040996c17cb3382cf3c07c61a4b35)
- version: 1.0.1.20231109 - changelog [`50dc69c`](https://github.com/janreges/siteone-crawler/commit/50dc69c9ab956691bbf97860355d410a0bdba0c9)

#### [v1.0.1](https://github.com/janreges/siteone-crawler/compare/v1.0.0...v1.0.1)

> 9 November 2023

- version: 1.0.1.20231109 [`e213cb3`](https://github.com/janreges/siteone-crawler/commit/e213cb326db78e2f69fd3e4f04b9728223550a3d)
- offline exporter: fixed case when a https:// website links to the same path but with the http:// protocol (it overwrote the proper *.html file with just a meta redirect .. real case from nextjs.org) [`4a1be0b`](https://github.com/janreges/siteone-crawler/commit/4a1be0bdfb62167c498f6c3b4c91fe74532ff833)
- html processor: force to remove all anchor listeners when NextJS is detected (it is very hard to achieve a working NextJS with offline file:// protocol) [`2b1d935`](https://github.com/janreges/siteone-crawler/commit/2b1d935419bade80d8e6ab07b2ae04ded0df131e)
- file exporters: now by default crawler generates a html/json/txt report to 'tmp/[report|output].%domain%.%datetime%.[html|json|txt]' .. I assume that most people will want to save/see them [`7831c6b`](https://github.com/janreges/siteone-crawler/commit/7831c6b87dd41444a0fca529bc450bf7934ef541)
- security analysis: removed multi-line console output for recommendations .. it was ugly [`310af30`](https://github.com/janreges/siteone-crawler/commit/310af308859dbb2fd5895af468195e2339f2788d)
- json output: added JSON_UNESCAPED_UNICODE for unescaped unicode chars (e.g. czech chars will be readable) [`cf1de9f`](https://github.com/janreges/siteone-crawler/commit/cf1de9f60820963ccb78a00b43ca3aec8b311a77)
- mailer: do not send e-mails in case of interruption of the crawler using ctrl+c [`19c94aa`](https://github.com/janreges/siteone-crawler/commit/19c94aac8211b4550ba11497e1332d604f8cdbc7)
- refactoring: manager stats logic extracted into ManagerStats and implemented also into manager of content processors + stats added into 'Crawler stats' tab in HTML report [`3754200`](https://github.com/janreges/siteone-crawler/commit/3754200652dc91ac05efe22812e64c0e4be84019)
- refactoring: content related logic extracted to content processors based on ContentProcessor interface with methods findUrls():?FoundUrls, applyContentChangesForOfflineVersion():void and isContentTypeRelevant():bool + better division of web framework related logic (NextJS, Astro, Svelte, ...) + better URL handling and maximized usage of ParsedUrl [`6d9f25c`](https://github.com/janreges/siteone-crawler/commit/6d9f25ce82f8a1cfbfbc6bc0b5a6a07262c427b1)
- phpstan: ignore BASE_DIR warning [`6e0370a`](https://github.com/janreges/siteone-crawler/commit/6e0370aafe02d3bb2ca528ea8a9a37995f5ddce6)
- offline website exporter: improved export of a website based on NextJS, but it's not perfect, because the latest NextJS versions do not have some JS/CSS paths in the code; they are generated dynamically from arrays/objects [`c4993ef`](https://github.com/janreges/siteone-crawler/commit/c4993efcb97f7058834713ed273f9c4274be5cad)
- seo analyzer: fixed trim() warning when no &lt;h1&gt; found [`f0c526f`](https://github.com/janreges/siteone-crawler/commit/f0c526f5d2ff7d0155c1bfc7da7a6c0f2f7a1419)
- offline export: a lot of improvements when generating the offline version of the website on NextJS - chunk detection from the manifest, replacing paths, etc. [`98c2e15`](https://github.com/janreges/siteone-crawler/commit/98c2e15acf4e22d25301d160968555c19ddd44cc)
- seo and og: fixed division by zero when no og/twitter tags found [`19e4259`](https://github.com/janreges/siteone-crawler/commit/19e4259c519a3e41eb7aa8eabce80e6364e74639)
- console output: lots of improvements for nice, consistent and minimal word-wrap output [`596a5dc`](https://github.com/janreges/siteone-crawler/commit/596a5dc17945359ffc0fef2ed8ed8ee8bfc1db00)
- basic file/dir structure: created ./crawler (for Linux/macOS) and ./crawler.bat for Windows, init script moved to ./src, small related changes about file/dir path building [`5ce41ee`](https://github.com/janreges/siteone-crawler/commit/5ce41ee8e78425747bf40327152bd99499c64013)
- header status: ignore too dynamic Content-Disposition header [`4e0c6fd`](https://github.com/janreges/siteone-crawler/commit/4e0c6fdf5c356f8c0eea78ccebe29641b90f96b4)
- offline website exporter: added .html extensions to typical dynamic language extensions, because without it the browser will show them as source code [`7130b9e`](https://github.com/janreges/siteone-crawler/commit/7130b9eb666eca5b08c9dbeda91198bc85b31379)
- html report: show tables with details, even if they are without data (it is good to know that the checks were carried out, but nothing was found) [`da019e4`](https://github.com/janreges/siteone-crawler/commit/da019e4591682c21e9f78de1ec26939088d92ccc)
- tests: repaired tests after last changes of file/url building for offline website .. merlot is great! [`7c77c41`](https://github.com/janreges/siteone-crawler/commit/7c77c411ff67c01e07d16cb2acce0e926b264fcd)
- utils: be more precise and do not replace attributes in SVG .. creative designers will not love you when looking at the broken SVG in HTML report [`3fc81bb`](https://github.com/janreges/siteone-crawler/commit/3fc81bb0c47eef2935da2e74721a809a9aff0959)
- utils: be more precise in parsing phone numbers, otherwise people will 'love' you because of false positives .. wine is still great [`51fd574`](https://github.com/janreges/siteone-crawler/commit/51fd574c764d832d74cb5e67eed890bd9d349a5c)
- html parser: better support for formatted html with tags/attributes on multiple lines [`89a36d2`](https://github.com/janreges/siteone-crawler/commit/89a36d2fcf3d96b61c4b3d2e20d5a46f4cb96cb8)
- utils: don't be hungry in stripJavaScript() because you ate half of my html :) wine is already in my head... [`0e00957`](https://github.com/janreges/siteone-crawler/commit/0e0095727638b7940d2e555a6be231ad3dde19e4)
- file result storage: changed cache directory structure for consistency with http client's cache, so it looks like my.domain.tld-443/04/046ec07c.cache [`26bf428`](https://github.com/janreges/siteone-crawler/commit/26bf428f95bc428485d7cf505e74c8a69c94d869)
- http client cache: for better consistency with result storage cache, directory structure now contains also port, so it looks like my.domain.tld-443/b9/b989bdcf2b9389cf0c8e5edb435adc05.cache [`a0b2e09`](https://github.com/janreges/siteone-crawler/commit/a0b2e09d01e36aed56c0208a8001d616755de096)
- http client cache: improved directory structure for large scale and better orientation for partial cache deleting.. current structure in tmp dir: my.domain.tld/b9/b989bdcf2b9389cf0c8e5edb435adc05.cache [`10e02c1`](https://github.com/janreges/siteone-crawler/commit/10e02c189297f28ea563ba6f3792462c2d6790ea)
- offline website exporter: better srcset handling - urls can be defined with or without sizes [`473c1ad`](https://github.com/janreges/siteone-crawler/commit/473c1ad0d753df209aa160b0d90687c4bff21912)
- html report: blue color for search term, looks better [`cb47df9`](https://github.com/janreges/siteone-crawler/commit/cb47df98e230c0375dbcb14c278250709bf3644a)
- offline website exporter: handled situation of the same-name folder/file when both the folder /foo/next.js/ and the file /foo/next.js existed on the website (real case from vercel.com) [`7c27d2c`](https://github.com/janreges/siteone-crawler/commit/7c27d2c2277dd134615563ee4eaa706ec0ee7485)
- exporters: added exec times to summary messages [`41c8873`](https://github.com/janreges/siteone-crawler/commit/41c8873dc33d7f08d91f77d71fcf1bf2fafa30ae)
- crawler: use port from URL if defined or by scheme .. previous solution didn't work properly for localhost:port and parsed URLs to external websites [`324ba04`](https://github.com/janreges/siteone-crawler/commit/324ba04267b962a56817dd10e3ecba7777702aa2)
- heading analysis: changed sorting to DESC by errors, renamed Headings structure -&gt; Heading structure [`dbc1a38`](https://github.com/janreges/siteone-crawler/commit/dbc1a38f33d4094aebe64020531518538e2b3baf)
- security analysis: detection and ignoring of URLs that point to a non-existent static file but return 404 HTML, better description [`193fb7d`](https://github.com/janreges/siteone-crawler/commit/193fb7dcf1f994aba69b646576bf7c6f8701a975)
- super table: added escapeOutputHtml property to column for better escape managing + updated related supertables [`bfb901c`](https://github.com/janreges/siteone-crawler/commit/bfb901cb82b9cda81198df0dc87885b5eceb5c93)
- headings analysis: replace usage of DOMNode-&gt;textContent because when the headings contain other tags, including &lt;script&gt;, textContent also contains JS code, but without the &lt;script&gt; tag [`5c426c2`](https://github.com/janreges/siteone-crawler/commit/5c426c24969a063aa3366da02520025733cf16e7)
- best practices: better missing quotes detection and minimizing false positives in special cases (HTML/JS in attributes, etc.) [`b03a534`](https://github.com/janreges/siteone-crawler/commit/b03a5345e7f71f880ee4d36fb9f51c230d8c772f)
- best practices: better SVG detection and minimizing false positives (e.g. code snippets with SVG), improved look in HTML report and better descriptions [`c35f7e2`](https://github.com/janreges/siteone-crawler/commit/c35f7e226f6cd384e5c8cf4b9af3a1a0d3be4cfc)
- headers analysis: added [ignored generic values] or [see values below] for specific headers [`a7b444d`](https://github.com/janreges/siteone-crawler/commit/a7b444dab0e1c3949abfa0e0746db18343b9b55d)
- core options: changed --hide-scheme-and-host to --show-scheme-and-host (hiding scheme+host by default is better) [`3c202e9`](https://github.com/janreges/siteone-crawler/commit/3c202e998a824f97b6f481575a24e2924c9dc663)
- truncating: replaced '...' with '…' [`870cf8c`](https://github.com/janreges/siteone-crawler/commit/870cf8cd447fd14e389d76bcc8853b1e691f5349)
- accessibility analyzer: better descriptions [`514b471`](https://github.com/janreges/siteone-crawler/commit/514b47124d101cd4f0bd67148f41ea5644febd62)
- crawler & http client: if the response is loaded from the cache, we do not wait due to rate limiting - very useful for repeated executions [`61fbfab`](https://github.com/janreges/siteone-crawler/commit/61fbfab34ba07c1856099051b8f68dc76b1adf09)
- header stats: added missing strval in values preview [`9e11030`](https://github.com/janreges/siteone-crawler/commit/9e1103064af0962ed4963cace61bf7ad201d19a2)
- content type analyzer: increased column width for MIME type from 20 to 26 (enough for application/octet-stream) [`c806674`](https://github.com/janreges/siteone-crawler/commit/c806674ee82d0aba90a9d61e10ff2b5e2cf6c813)
- SSL/TLS analyzer: fixed issues on Windows with Cygwin where nslookup does not work reliably [`714b9e1`](https://github.com/janreges/siteone-crawler/commit/714b9e12a2426574731b62d460c98f1fed95aa18)
- text output: removed redundant whitespaces from banner after .YYYYMMDD was added to the version number [`8b76205`](https://github.com/janreges/siteone-crawler/commit/8b76205b41ca9cbf4dd32e7d908f4fe932c4a2a3)
- readme: added link to #ready-to-use-releases to summary [`574b39e`](https://github.com/janreges/siteone-crawler/commit/574b39e836794c98e7be8ceaa81d1ab0c50ab149)
- readme: added section Ready-to-use releases [`44d686b`](https://github.com/janreges/siteone-crawler/commit/44d686b910a36747d002ec2886b85c22be5c4864)
- changelog: added changelog by https://github.com/cookpete/auto-changelog/tree/master + added 'composer changelog' [`d11af7e`](https://github.com/janreges/siteone-crawler/commit/d11af7e4d847362276e1dd4cec3c25cad38263fb)
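
The cache layout and port handling described in the entries above (`my.domain.tld-443/b9/b989bdcf2b9389cf0c8e5edb435adc05.cache`, and "use port from URL if defined or by scheme") can be illustrated with a minimal sketch. It is written in Python for brevity and is not the crawler's actual code: the function names are hypothetical, and hashing the full URL with MD5 is an assumption based only on the 32-character hexadecimal file name shown in the example path.

```python
import hashlib
from urllib.parse import urlsplit

# Default ports per scheme, used only when the URL has no explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port from the URL, else the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


def cache_path(url: str) -> str:
    """Build a relative cache path shaped like <host>-<port>/<2 hex chars>/<hash>.cache.

    ASSUMPTION: the hash input (the full URL) and the hash function (MD5)
    are guesses for illustration; the changelog only shows the path shape.
    """
    parts = urlsplit(url)
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return f"{parts.hostname}-{effective_port(url)}/{digest[:2]}/{digest}.cache"


if __name__ == "__main__":
    print(cache_path("https://my.domain.tld/some/page"))
    print(cache_path("http://localhost:8080/"))
```

Putting the host and port in the top-level directory keeps each crawled origin separate, which matches the stated goal of making partial cache deletion easier; the two-character subdirectory limits how many files end up in a single folder.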

#### v1.0.0

> 7 November 2023

- proxy: added support for --proxy=&lt;host:port&gt;, closes #1 [`#1`](https://github.com/janreges/siteone-crawler/issues/1)
- license: renamed to LICENSE.md [`c0f8ec2`](https://github.com/janreges/siteone-crawler/commit/c0f8ec22a68741b1740981dc98bdec13d8e5182a)
- license: added license CC 4.0 BY [`bd5371b`](https://github.com/janreges/siteone-crawler/commit/bd5371b99363fbb5de29c33f0fcc572d154e467d)
- version: set v1.0.0.20231107 [`bdbf2be`](https://github.com/janreges/siteone-crawler/commit/bdbf2be97e68cfa01fb992fb960c1c5313d5780f)
- version: set v1.0.0 [`a98e61e`](https://github.com/janreges/siteone-crawler/commit/a98e61e161652861541743df6fe1d8c55be446f9)
- SSL/TLS analyzer: uncolorize valid-to in summary item, phpstan fixes (non-functional changes) [`88d1d9f`](https://github.com/janreges/siteone-crawler/commit/88d1d9fec8bc29cd26ab88c18d6c122939b59bba)
- content type analyzer: added table with MIME types [`b744f13`](https://github.com/janreges/siteone-crawler/commit/b744f139e417b625bd22ea282f744b55406853b1)
- seo analysis: added TOP10 non-unique titles and descriptions to tab SEO and OpenGraph + badges [`4ae14c1`](https://github.com/janreges/siteone-crawler/commit/4ae14c13be5163704c2c6a2d55d75bc83f41f801)
- html report: increased sidebar width to prevent wrapping in the case of higher numbers in badges [`c5c8f4c`](https://github.com/janreges/siteone-crawler/commit/c5c8f4cae991bbdd6b6a8a7fab6cbaae1c199344)
- dns analyzer: increased column size to prevent auto-truncation of dns/ip addresses [`b4d4127`](https://github.com/janreges/siteone-crawler/commit/b4d4127b2b67efd63fff53ae0ad27b6c9a987501)
- html report: fixed badge with errors on DNS and SSL tab [`e290403`](https://github.com/janreges/siteone-crawler/commit/e29040349ac4966b22842e52ee4c102a67f9860c)
- html report: ensure that no empty tabs will be in report (e.g. in case where all analyzers will be deactivated by --analyzer-filter-regex='/anything/') [`6dd5bcc`](https://github.com/janreges/siteone-crawler/commit/6dd5bcc67d215bca085ef75cb98398aa162ce5fa)
- html report: improved replacement of non-badged cells to transparent badge for better alignment [`172a074`](https://github.com/janreges/siteone-crawler/commit/172a074c519a55c492d2b72250232e23749cd75b)
- html report: increased visible part of long tables from 500px to 658px (based on typical sidebar height), updated title [`0be355f`](https://github.com/janreges/siteone-crawler/commit/0be355f5474ad6aff461ac3362127569d29eac22)
- utils: selected better colors for ansi-&gt;html conversion [`6c2a8e3`](https://github.com/janreges/siteone-crawler/commit/6c2a8e364790e2cdb338f164c572aafd9e3db6c1)
- SSL/TLS analyzer: evaluation and hints about unsafe or recommended protocols, from-to validation, colorized output [`5cea1fe`](https://github.com/janreges/siteone-crawler/commit/5cea1fe51d500db433c4d86fe5fa8660d2ef2a14)
- SEO & OpenGraph analyzers: refactored class names, headings structure moved to own tab, other small improvements [`75a9724`](https://github.com/janreges/siteone-crawler/commit/75a97245af1e896ab3304891dd4459873ad3a26f)
- security analyzer: better vulnerabilities explanation and better output formatting [`ee172cb`](https://github.com/janreges/siteone-crawler/commit/ee172cb25073e2e5452b38d5a6c52802e9585bcc)
- summary: selected more suitable icons from the utf-8 set that work well in the console and HTML [`ef67483`](https://github.com/janreges/siteone-crawler/commit/ef67483827755895f0edf3149f4f106d28ba1942)
- header stats: addValue() can accept both string and array [`a0d746b`](https://github.com/janreges/siteone-crawler/commit/a0d746ba9f956c03cb4ad1bddee14a26951ff86d)
- headers & redirects: text improvements [`3ac9010`](https://github.com/janreges/siteone-crawler/commit/3ac9010c33e9048f1b3d24182232ae182ae681ca)
- dns analyzer: colorized output and added info about CNAME chain into summary [`7dd1f8a`](https://github.com/janreges/siteone-crawler/commit/7dd1f8ac1eafcdcd92f651d397b561f6383fdcfc)
- best practices analyzer: added SVG sanitization to prevent XSS, fine-tuning of missing quotes detection, typos [`4dc1eb5`](https://github.com/janreges/siteone-crawler/commit/4dc1eb592de3631f61ed67dfb87466a95462d5f3)
- options: added extras option, e.g. for number range validation [`760a865`](https://github.com/janreges/siteone-crawler/commit/760a865082a7cd5f8e439f3fc9094fb7503a78be)
- seo and socials: small type-hint and phpstan fixes [`bf695be`](https://github.com/janreges/siteone-crawler/commit/bf695be5fa859ca49bef67fb6511039e4301bb34)
- best practice analyzer: added found depth to messages about too deep DOM depth [`220b43c`](https://github.com/janreges/siteone-crawler/commit/220b43c77a6d4747a29cf483e11a985dc07ac460)
- analysis: added SSL/TLS analyzer with info about SSL certificate, its validity, supported protocols, issuer .. in the report SSL/TLS info are under tab 'DNS and TLS/SSL' [`3daf175`](https://github.com/janreges/siteone-crawler/commit/3daf1757e1eee765ea3d6b2dca1ed55ffb694d4a)
- super table: show fulltext only for &gt;= 10 rows + visible height of the table in HTML shortened to 500px/20 rows with a 'Show entire table' link .. implemented only with HTML+CSS, so that it also works on devices without JS (e.g. e-mail browser on iOS) [`7fb9e52`](https://github.com/janreges/siteone-crawler/commit/7fb9e52de2514b0fc1a11032238de815f76acb37)
- analysis: added seo & sharing analysis - meta info (title, h1, description, keywords), OG/Twitter data, heading structure details [`53e12e6`](https://github.com/janreges/siteone-crawler/commit/53e12e63102d70b0329194493599523808758716)
- best practices: added checks for WebP and AVIF images [`0ccabc6`](https://github.com/janreges/siteone-crawler/commit/0ccabc633cdae4b7ef7b03aad22ab8cfab1a590f)
- best practices: added brotli support reporting to tables [`7ff2c53`](https://github.com/janreges/siteone-crawler/commit/7ff2c53e56705c19de77d54db578338252007b99)
- super table: added option to specify whether the table should be displayed on the output to the console, html or json [`6bb6217`](https://github.com/janreges/siteone-crawler/commit/6bb62177522a61bab1673b9d5f19e18f50bd54a3)
- headers analysis: analysis of HTTP headers of all requests to the main domain, their detailed breakdown, values and statistics [`1fcc1db`](https://github.com/janreges/siteone-crawler/commit/1fcc1dba38a3ac41f0547a4f11a2aef9af1d876f)
- analysis: fixed search of attributes with missing quotes [`3db31b9`](https://github.com/janreges/siteone-crawler/commit/3db31b9c01317d8c8ac6eba6b98679be79982c3e)
- super table: added the number of found/displayed lines next to the full text [`6e7f3d4`](https://github.com/janreges/siteone-crawler/commit/6e7f3d4b4de0cfa378920c9389291a9902c0c486)
- super table: removed setting column widths for HTML table - works best without forcing widths [`2a785e7`](https://github.com/janreges/siteone-crawler/commit/2a785e70b675ef681b005042a50b289b3b29d600)
- html report: even wider content of the report is allowed, for better functioning for high-resolution displays [`363990c`](https://github.com/janreges/siteone-crawler/commit/363990c3566cb39d653ab2760df6bb4d2acd8149)
- pages 404: truncate too long urls [`082bae6`](https://github.com/janreges/siteone-crawler/commit/082bae6f28d2ba8296591a0885548faa0b38a59a)
- fixes: fixed various minor warnings related to specific content or parameters [`da1802d`](https://github.com/janreges/siteone-crawler/commit/da1802d82f8ccf2de3f4329bf3b952ebefeb3449)
- options: ignore extra comma or empty value in list [`3f5cab6`](https://github.com/janreges/siteone-crawler/commit/3f5cab68bc4981faea7b7bed30b9f687ea773830)
- super table: added useful fulltext search for all super tables [`50a4edf`](https://github.com/janreges/siteone-crawler/commit/50a4edf9caa69f67fdc21c3c32a92d201c211ccc)
- colors: lighter color for badge.neutral in light mode because the previous one was too contrasting [`0dbad09`](https://github.com/janreges/siteone-crawler/commit/0dbad0920f8f8a9f14186f9513e3ea6793fcf297)
- colors: notice is now blue instead of yellow and severity order fix in some places (critical -&gt; warning -&gt; notice -&gt; ok -&gt; info) [`1b50b99`](https://github.com/janreges/siteone-crawler/commit/1b50b99ae079a4d1cdc350038e105d469dec524a)
- colors: changed gray color to more platform-consistent color, otherwise gray was too dark on macOS [`173c9bd`](https://github.com/janreges/siteone-crawler/commit/173c9bd211bf066b69bb3adbde487ec3e99f6da1)
- scripts: removed helper run.tests* scripts [`e9f0c8f`](https://github.com/janreges/siteone-crawler/commit/e9f0c8ff768042737bfab57b5d2270df995c611e)
- analysis: added table with detailed list of security findings and URLs [`5b9e0fe`](https://github.com/janreges/siteone-crawler/commit/5b9e0fe1c3a514941abf2e277bf3f2bd4e017004)
- analysis: added SecurityAnalyzer, which checks the existence and values of security headers and performs HTML analysis for common issues [`0cb7cb9`](https://github.com/janreges/siteone-crawler/commit/0cb7cb9daac5303227e31b72b0f6931218968bf7)
- http auth: added support for basic HTTP authentication by --http-auth=username:password [`147e004`](https://github.com/janreges/siteone-crawler/commit/147e0040e97f6ad37da7897813063cbb73302e22)
- error handling: improved behaviour in case of entering a non-existent domain or problems with DNS resolving [`5c08fb4`](https://github.com/janreges/siteone-crawler/commit/5c08fb4c82409863f73fcdcd66f9a0ba76206c5c)
- html report: implemented completely redesigned html report with useful information, with light/dark mode and possibility to sort tables by clicking on the header .. design inspired by Zanrly from Shuffle.dev [`05da14f`](https://github.com/janreges/siteone-crawler/commit/05da14f50b108deec4827c5c0324bbd1b9775b37)
- http client: fix of extension detection in the case of very non-standard or invalid URLs [`113faa5`](https://github.com/janreges/siteone-crawler/commit/113faa501016f14c017f5f1eaa586a6fae35efbf)
- options: increased default memory limit from 512M to 2048M + fixed refactored 'file-system' -&gt; 'file' in docs for result storage [`1471b28`](https://github.com/janreges/siteone-crawler/commit/1471b2884bcbf1806a388e4ae85cc4f7e1bc11fe)
- utils: fix that date formats are not detected as a phone number in parsePhoneNumbersFromHtml() [`e4e1009`](https://github.com/janreges/siteone-crawler/commit/e4e10097f7e74816dd716d2713516d5ff8eef39a)
- strict types: added declare(strict_types=1) to all classes with related fixes and copyright [`92dd47c`](https://github.com/janreges/siteone-crawler/commit/92dd47c72e4f1aaa5a05187f60f2a9f0a5c285ee)
- dns analyzer: added information about the DNS of the given domain - shows the entire cname/alias chain as well as the final resolved IPv4/IPv6 addresses + tests [`199421d`](https://github.com/janreges/siteone-crawler/commit/199421df3c96e2f2bec20f45230cbd812e9fc21c)
- utils: helper function parsePhoneNumbersFromHtml() used in BestPracticeAnalyzer + tests [`09cc5fb`](https://github.com/janreges/siteone-crawler/commit/09cc5fbbbdf7f4a706ef912221e32d476fa397b4)
- summary consistency: forced dots at the end of each item in the summary list [`4758e38`](https://github.com/janreges/siteone-crawler/commit/4758e38c3b2ab73476516662129e3b6abd78ff44)
- crawler: support for more benevolent tags for title and meta attributes .. e.g. even the title can contain other HTML attributes [`770b339`](https://github.com/janreges/siteone-crawler/commit/770b339fb7b6ac86af56a864feb184977974d37d)
- options: default timeout increased from 3 to 5 seconds .. after testing on a lot of websites, it makes better sense [`eb74207`](https://github.com/janreges/siteone-crawler/commit/eb7420736f5c4d353651ec39d8d030a8485e1486)
- super table: added option to force non-breakable spaces in column cells [`3500818`](https://github.com/janreges/siteone-crawler/commit/35008185064331d33c380e0643606f2dbaeb2b64)
- best practice analyzer: added measurement of individual steps + added checking of active links with phone numbers &lt;a href="tel: 123..."&gt; [`1bb39e8`](https://github.com/janreges/siteone-crawler/commit/1bb39e87a440975e8956fbf1d66b81ef1b424574)
- accessibility analyzer: added measurement of individual steps + removed DOMDocument parsing after refactoring [`2a7c49b`](https://github.com/janreges/siteone-crawler/commit/2a7c49b415dd2864cc37497d409cb083abb99df5)
- analysis: added option to measure the duration and number of analysis steps + the analyzeVisitedUrl() method now accepts DOMDocument (if HTML) so the analyzers themselves do not have to parse it again [`d8b9a3d`](https://github.com/janreges/siteone-crawler/commit/d8b9a3d8e0016ec4cc6da908a1bd9db39370e9da)
- super table: calculated auto-width can't be shorter than column name (label) [`b97484f`](https://github.com/janreges/siteone-crawler/commit/b97484f22d59bee04b935fa204d18c609ba8658c)
- utils: removed ungreedy flag from all regular expressions, it caused problems under some circumstances [`03fc202`](https://github.com/janreges/siteone-crawler/commit/03fc202ed2f30fe4bd2001e8fcaecbea5ca45f7e)
- phpstan: fixed all level 5 issues [`04c21aa`](https://github.com/janreges/siteone-crawler/commit/04c21aaeeed24117740fac22b5756363e3a4769d)
- phpstan: fixed all level 4 issues [`91fee49`](https://github.com/janreges/siteone-crawler/commit/91fee49a0aefa603c4dba9bc1f19d658a7ab413e)
- phpstan: fixed all level 3 issues [`2f7866a`](https://github.com/janreges/siteone-crawler/commit/2f7866a389b05e3c796e7f1f0bd7f6410a23cb05)
- phpstan: fixed all level 2 issues [`e438996`](https://github.com/janreges/siteone-crawler/commit/e4389962be4a476bdcacc6acc18f36c7037b90ee)
- phpstan: installed phpstan with level 2 for now [`b896e6c`](https://github.com/janreges/siteone-crawler/commit/b896e6c0552e4fd938088594a7d44d6af14fc809)
- tests: allowed nextjs.org for crawling (without it, a couple of tests incorrectly failed) [`cdc7f56`](https://github.com/janreges/siteone-crawler/commit/cdc7f5688f6aca0e822c3fa6daee6a3acd99eeeb)
- refactor: moved /Crawler/ into /src/Crawler/ + added file attachment support to mailer [`2f0d26c`](https://github.com/janreges/siteone-crawler/commit/2f0d26c7d2f7cb65495b375dd4b11bf7849888e2)
- sitemap exporter: renamed addErrorToSummary -&gt; addCriticalToSummary [`e46e192`](https://github.com/janreges/siteone-crawler/commit/e46e1926df52a3edfc4137ebd8ede9dee8a45bf1)
- text output: added options --show-inline-criticals and --show-inline-warning, which display the found problems directly under the URL - the displayed table will be less clear, but the problems are clearly visible [`725b212`](https://github.com/janreges/siteone-crawler/commit/725b2124172710895d86503fd4a933e2ea91efaa)
- composer.json: added require declarations for ext-dom, ext-libxml (used in analyzers) and ext-zlib (used in cache/storages) [`3542cf0`](https://github.com/janreges/siteone-crawler/commit/3542cf03829e9a3c745e58e0df1bc2f6284d25ba)
- analysis: added accessibility and best practices analyzers with useful checks [`860316f`](https://github.com/janreges/siteone-crawler/commit/860316fa685509104462412aeb125417dceaee28)
- analysis: added AnalysisManager for better analysis control with the possibility to filter required analyzers using --analyzer-filter-regex [`150569f`](https://github.com/janreges/siteone-crawler/commit/150569fd20c380781ed5971cefd47308762a730a)
- result storage: options --result-storage, --result-storage-dir and --result-storage-compression for storage of response bodies and headers (memory storage is used by default, but you can use file storage for extremely large websites) [`d2a8fab`](https://github.com/janreges/siteone-crawler/commit/d2a8fabcef72067500dfcb0065e87ebc4395dac3)
- http cache: added --http-cache-dir and --http-cache-compression parameters (by default http cache is on and set to 'tmp/http-client-cache' and compression is disabled) [`2eb9ed8`](https://github.com/janreges/siteone-crawler/commit/2eb9ed86d9d53b4735a3de3cf6d06b652818dbc0)
- super table: the currentOrderColumn is now optional - sometimes we want to leave the table sorted according to the input array [`4fba880`](https://github.com/janreges/siteone-crawler/commit/4fba880fcf137a6207df4c5177cf3ec80afaa3ae)
- analysis: replaced severity ok/warning/error with ok/notice/warning/critical - it made more sense for analyzers [`18dbaa7`](https://github.com/janreges/siteone-crawler/commit/18dbaa7a4a760874ba39c75af28f7e808fb8eb2e)
- analysis: added support for immediate analysis of visited URLs with the possibility to insert the analyzer's own columns into the main table [`004865f`](https://github.com/janreges/siteone-crawler/commit/004865f223c9ec688c4f522cd8f93d8022458130)
- content types: fixed json/xml detection [`00fc180`](https://github.com/janreges/siteone-crawler/commit/00fc1808838c7a191cc9986e884ffda26f841281)
- content type analyzer: decreased URLs column size from 6 to 5 - that's enough [`2eefbaf`](https://github.com/janreges/siteone-crawler/commit/2eefbafad24f68118a2efe8d6ddedc4d3d45b5cf)
- formatting: unification of duration formatting across the entire application [`412ee7a`](https://github.com/janreges/siteone-crawler/commit/412ee7ab5c5eda19dfc5492a6cc9edbb7c5969c6)
- super table: fixed sorting for array of arrays [`4829be8`](https://github.com/janreges/siteone-crawler/commit/4829be8f8e1d3f0d8201dedfa99d245453601422)
- source domains analyzer: minor formatting improvements [`2d32ced`](https://github.com/janreges/siteone-crawler/commit/2d32cedb59aa13e4e27a1dbe58eff586e4407cd9)
- offline website exporter: added info about successful export to summary [`92e7e46`](https://github.com/janreges/siteone-crawler/commit/92e7e46bdbc1f1cff329cf4aff5ee99dd70332e2)
- help: added red message about invalid CLI parameters also to the end of help output, because help is already too long [`6942e8f`](https://github.com/janreges/siteone-crawler/commit/6942e8f4535d748763a124207634ea7548bbfa83)
- super table: added column property 'formatterWillChangeValueLength' to handle situation with the colored text and broken padding [`7371a68`](https://github.com/janreges/siteone-crawler/commit/7371a68f11191b0b21307e6ca703e362f476b815)
- analyzers: setting a more meaningful analyzers order [`5e8f747`](https://github.com/janreges/siteone-crawler/commit/5e8f747392f291abdfb0140038c42fe84801955c)
- analyzers: added source domains analyzer with summary of domains and downloaded content types (number/size/duration) [`f478f17`](https://github.com/janreges/siteone-crawler/commit/f478f178fb2f79a81e5db89909951816ac6e1c9f)
- super table: added auto-width column feature [`d2c04de`](https://github.com/janreges/siteone-crawler/commit/d2c04dec3312d72ed373236d73f7a4d3bbf8c20d)
- renaming: '--max-workers' to '--workers' with possibility to use shortcut '-w=&lt;num&gt;' + adding possibility to use shortcut '-rps=&lt;num&gt;' for '--max-reqs-per-sec=&lt;num&gt;' [`218f8ff`](https://github.com/janreges/siteone-crawler/commit/218f8ffcca15550853bcb4ace44dedf260d1e735)
- extra columns: added ability to force columns to the required length via "!" + refactoring using ExtraColumn [`def82ff`](https://github.com/janreges/siteone-crawler/commit/def82ff3f5f11efa2e4ef812e086a5c8379ac962)
- readme: division of features into several groups [`c03d231`](https://github.com/janreges/siteone-crawler/commit/c03d2311b618f8aad165ffad39ae51989f60f846)
- offline exporter: export of the website to the offline form has already been fine-tuned (but not perfect yet), --disable-* options to disable JS/CSS/images/fonts/etc. and a lot of other related functionalities [`0d04a98`](https://github.com/janreges/siteone-crawler/commit/0d04a9805bdebea708eba44cc6680bd58995d559)
- crawler: added possibility to set speed via --max-reqs-per-sec (default 10) [`d57cc4a`](https://github.com/janreges/siteone-crawler/commit/d57cc4a39e6ce1882ee3233b015200382d90f06f)
- tests: dividing asserts for URL conversion testing into different detailed groups [`f6221cb`](https://github.com/janreges/siteone-crawler/commit/f6221cb5d3e5e844f146a95940479b20604c37cf)
- html url parser: added support for loading fonts from &lt;link href='...'&gt; [`4c482d1`](https://github.com/janreges/siteone-crawler/commit/4c482d1078fb535e4a3be96f6c3e7ded2ea02d65)
- manager: remove avif/webp support if OfflineWebsiteExporter is active - we want to use only long-supported jpg/png/gif on the local offline version [`3ec81d3`](https://github.com/janreges/siteone-crawler/commit/3ec81d338590ae16ee337cbbfa8a741e01b0522d)
- http response: transformation of the redirect to html with redirection through the &lt;meta&gt; tag [`8f6ff16`](https://github.com/janreges/siteone-crawler/commit/8f6ff161066a82af9ae91a738aae66327fe407b6)
- initiator: skip comments or empty arguments [`12f4c52`](https://github.com/janreges/siteone-crawler/commit/12f4c52b7fe0429926c2a6540e8842eae4882888)
- http client: added crawler signature to User-Agent and X-Crawler-Info header + added possibility to set Origin request header (otherwise some servers block downloading the fonts) [`ae4eaf3`](https://github.com/janreges/siteone-crawler/commit/ae4eaf3298e0bc94c1d913d08393426e380ba4ad)
- visited url: added isStaticFile() [`f1cd5e8`](https://github.com/janreges/siteone-crawler/commit/f1cd5e8e397b734dc3353db943c2928ff46cf520)
- crawler: increased pcre.backtrack_limit and pcre.recursion_limit (100x) to support longer HTML/CSS/JS [`35a6e9a`](https://github.com/janreges/siteone-crawler/commit/35a6e9a4729fffa7ee0a77b0be50621c4077a7b9)
- core options: renamed --headers-to-table to --extra-columns [`7c30988`](https://github.com/janreges/siteone-crawler/commit/7c30988fdecdaeb6aa89aed15a864a033c121d2f)
- crawler: added type for audio and xml + static cache for getContentTypeIdByContentTypeHeader [`386599e`](https://github.com/janreges/siteone-crawler/commit/386599e881051ae8c14b7ec9688690e50c0dd7dc)
- found urls: normalization of URL takes care of spaces + change of source type to int [`c3063a2`](https://github.com/janreges/siteone-crawler/commit/c3063a247f10bf00b8516eb2303bb85cab426c15)
- debugging: possibility to enable debugging through ParsedUrl [`979dc0e`](https://github.com/janreges/siteone-crawler/commit/979dc0e89af063b5ffe04b49275ceb0fa9191db2)
- offline url converter: class for solving the translation of URL addresses to offline/local + tests [`44118e6`](https://github.com/janreges/siteone-crawler/commit/44118e6bf96f6b25c7d8410084f76dfb3eb10188)
- url converter: TargetDomainRelation enum with tests [`fd6cf21`](https://github.com/janreges/siteone-crawler/commit/fd6cf216d903785adf46923ed2a805937f724d15)
- initiator: check only script basename in unknown args check [`888448f`](https://github.com/janreges/siteone-crawler/commit/888448fc9c598a7e8f750e746214b2834722b412)
- offline website export: to run the exporter, it is necessary to set --offline-export-directory [`33e9f95`](https://github.com/janreges/siteone-crawler/commit/33e9f952814b52bdfc7634cf4b9521d393b87417)
- offline website export: to run the exporter, it is necessary to set --offline-export-directory [`bcc007b`](https://github.com/janreges/siteone-crawler/commit/bcc007b6a3a9c0e9de23e76bd6f9150c7d2295c9)
- log & tmp: added .gitkeep for versioning of these folders - they are used by some optional features [`065f8ef`](https://github.com/janreges/siteone-crawler/commit/065f8ef27fabe889e8a35b98fd75ce260263d268)
- offline website export & tests: added the already well-functioning option to export the entire website to offline mode working from local static HTML files, including images, fonts, styles, scripts and other files (no documentation yet) + a lot of related changes in Crawler + added the first test covering some important functionality around relative URL building [`4633211`](https://github.com/janreges/siteone-crawler/commit/463321199e6f9bac10b097e3f286da6a13f36906)
- composer & phpunit: added composer, phpunit and license CC BY 4.0 [`4979143`](https://github.com/janreges/siteone-crawler/commit/4979143ac2aea9d7b3fe9fcfb9d57f1890c1f114)
- visited-url: added info on whether the URL is external and whether crawling it is allowed [`268a696`](https://github.com/janreges/siteone-crawler/commit/268a6960f8ff69046c8e6c73beae98d24b73ba1f)
- text-output: added peak memory usage and average traffic bandwidth to total stats [`cb68340`](https://github.com/janreges/siteone-crawler/commit/cb683407e2cdcd62f5484da96baf9ef43e49a4b3)
- crawler: added video support and fixed javascript detection by content-type [`3c3eb96`](https://github.com/janreges/siteone-crawler/commit/3c3eb9625f20657e971249c14cdff97a0a0b8687)
- url parsers: extraction of url parsing from html/css into dedicated classes and FoundUrl with info about source tag/attribute [`d87597d`](https://github.com/janreges/siteone-crawler/commit/d87597d36507c7bd6029f87bf1801586eea9b420)
- manager: ensure that done callback is executed only once [`d99cccd`](https://github.com/janreges/siteone-crawler/commit/d99cccd91b43680e0726f9c037fb568a9e8be1b4)
- http-client: extraction of http client functionality into dedicated classes and implemented cache for HTTP responses (critical for efficient development) [`8439e37`](https://github.com/janreges/siteone-crawler/commit/8439e376c50a346e133a2d99e7406020bb89030a)
- debugging: added debugging related expert options + Debugger class [`2c89682`](https://github.com/janreges/siteone-crawler/commit/2c89682feaf65a4f224da8ebaf05c48aa899eccc)
- parsed-url: added query, it is already needed [`860df08`](https://github.com/janreges/siteone-crawler/commit/860df086ae8c8556420d92e249b3b459b8bf288f)
- status: trim only HTML bodies because trim breaks some types of binary files, e.g. avif [`fca2156`](https://github.com/janreges/siteone-crawler/commit/fca2156a2f9607f705a32833a650ae70d5690772)
- url parsers: unification of extension length in relevant regexes to {1,10} [`96a3548`](https://github.com/janreges/siteone-crawler/commit/96a35484ba5ab0eee7e43837c1eade1aba6f8a57)
- basic-stats: fixed division by zero and nullable times [`8c38b96`](https://github.com/janreges/siteone-crawler/commit/8c38b9660752f132c09e3ceaab596e54176b46e9)
- fastest-analyzer: show only URLs with status 200 on the TOP list [`0085dd1`](https://github.com/janreges/siteone-crawler/commit/0085dd1fcbd3b5657eca73345921fe3fc6f407bc)
- content-type-analyzer: added stats for 42x statuses (429 Too many requests) [`4f49d12`](https://github.com/janreges/siteone-crawler/commit/4f49d124d1d9993abe3babd9a181c9768b5c2903)
- file export: fixed HTML report error after last refactoring [`e77fa6c`](https://github.com/janreges/siteone-crawler/commit/e77fa6cf791da08b522e2124545c303ab5de67ed)
- sitemap: publish only URLs with status 200 OK [`b2d4448`](https://github.com/janreges/siteone-crawler/commit/b2d44488a28aeca3421c36ca1e5ada0030de26d8)
- summary: added missing &lt;/ul&gt; and renamed heading Stats to Summary in HTML report [`c645e16`](https://github.com/janreges/siteone-crawler/commit/c645e16016611a49f70c3d5de9e6ab4d58a45048)
- status summary: added summary showing important analyzed metrics with OK/WARNING/CRITICAL icons, ordering by severity and INFO about the export execution + interrupting the script by CTRL+C will also run all analyzers, exporters and display all statistics for already processed URLs [`fd643d0`](https://github.com/janreges/siteone-crawler/commit/fd643d016036f4eed5418375f8b25cfe08549ed0)
- output consistency: ensuring color and formatting consistency of different types of values (status codes, request durations) [`3ffe1d2`](https://github.com/janreges/siteone-crawler/commit/3ffe1d2a939d718a6fae9c1f927646cfbec808f4)
- analyzers: added content-type analyzer with stats for total/avg times, total sizes and statuses 200x, 300x, 400x, 500x [`0475347`](https://github.com/janreges/siteone-crawler/commit/04753478bce1f81dfdab73cd19b0541e725317fe)
- crawler: better content-type handling for statistics and added 'Type' column to URL lists + refactored info from array to class [`346caf4`](https://github.com/janreges/siteone-crawler/commit/346caf45f3a18e75a0cf4d0e65961fbee63c9632)
- supertable: is now able to display from the array-of-arrays as well as from the array-of-objects + it can translate color declarations from bash to HTML colors when rendering to HTML [`80f0b1c`](https://github.com/janreges/siteone-crawler/commit/80f0b1ca3d50ee7dfae9a01eccbe15fcc06a72d5)
- analyzers: TOP slowest/fastest pages analyzer now evaluates only HTML pages, otherwise static content skews the results + decreased minTime for slowest analysis from 0.1 to 0.01 sec (on a very fast and cached website, the results were empty, which is not ideal) [`1390bbc`](https://github.com/janreges/siteone-crawler/commit/1390bbc6daa5484fed8612731dc99f734c406042)
- major refactoring: implementation of the Status class summarizing useful information for analyzers/exporters (replaces the JsonOutput over-use) + implementation of basic analyzers (404, redirects, slow/fast URLs) + SuperTable component that exports data to text and HTML + choice of memory-limit setting + change of some default values [`efb9a60`](https://github.com/janreges/siteone-crawler/commit/efb9a60aa0be5cb8af55b09723a236370fccb904)
- url parsing: fixes for cases when query params are used with htm/html/php/asp etc. + mini readme fix [`af1acfa`](https://github.com/janreges/siteone-crawler/commit/af1acfa9efa536d2ef2e51b2f0a2404ef9d2417a)
- minor refactoring: renaming of core options, small non-functional changes [`1dd258e`](https://github.com/janreges/siteone-crawler/commit/1dd258e81eb4d06658e5e41e62141d5be48ce622)
- major refactoring: better modularity and auto loading in the area of the exporters, analyzers, their configurability and help auto-building + new mailer options --mail-from-name and --mail-subject-template [`0c57dbd`](https://github.com/janreges/siteone-crawler/commit/0c57dbdb30702cc6669a703788b530fbc4d04af6)
- json output: automatic shortening of the URL according to the text width of the console, because if the long URL exceeds the width of the window, the rewriting of the line with the progressbar stops working properly [`106332b`](https://github.com/janreges/siteone-crawler/commit/106332b1d8421dbea5f8725536fa3efed6834564)
- manual exit: captures CTRL+C and ends with the statistics for at least the current URLs [`7f4fc80`](https://github.com/janreges/siteone-crawler/commit/7f4fc80c5f9f0fe47da2d9bee2e139489c36a966)
- error handling: show a red error with help and info on how to fix it when the queue or visited tables are full [`4efbd73`](https://github.com/janreges/siteone-crawler/commit/4efbd734d775aaa2e6dd66d2d8ed7a007871a1dd)
- DOM elements: implemented DOM elements counter and when you add 'DOM' to --headers-to-column you will see DOM elements count [`1837a9c`](https://github.com/janreges/siteone-crawler/commit/1837a9cb12f97a33aec6bcf03a54250bd48545a2)
- sitemap and no-color: implemented xml/txt sitemap generator and --no-color option [`f9ade44`](https://github.com/janreges/siteone-crawler/commit/f9ade44d470d97bcc399039bc91a5ce74a6537c1)
- readme: added table of contents and rewrote intro, features and installation chapters [`469fd1c`](https://github.com/janreges/siteone-crawler/commit/469fd1cf15af4d191c239b2523e0fd8614f7653f)
- readme: removed deprecated and duplicate mailer docs [`c5effe8`](https://github.com/janreges/siteone-crawler/commit/c5effe84aece85f7a6aaa97228cd84a5eade4f8b)
- readme and CLI help: divided the parameters into clear groups and improved parameter descriptions - README.md has the detailed form, the CLI help has a shorter version [`19ff724`](https://github.com/janreges/siteone-crawler/commit/19ff724ec0d21f08c4d6cf09def06ba27b023598)
- include/ignore regex: added option to limit crawled URLs with the common combination of --include-regex and --ignore-regex [`88e393d`](https://github.com/janreges/siteone-crawler/commit/88e393d33c07fab77173432fd0faf7fe631c2c2c)
- html report: masking passwords, styling, added logo, better info ordering and other small changes [`4cdcdab`](https://github.com/janreges/siteone-crawler/commit/4cdcdabf145ffe6f02d84b3250b2a1fc46a5677a)
- mailer & exports: implemented ability to send HTML report to e-mail via SMTP + exports to HTML/JSON/TXT file + better reporting of HTTP error conditions (timeout, etc.) + requests for assets are sent only as HEAD without the need to download all binary data + updated documentation [`a97c29d`](https://github.com/janreges/siteone-crawler/commit/a97c29d78f07b4d854853c474fb9d0542b6f2796)
- table output: option to set the expected column length for a better look, e.g. 'X-Cache(10)' [`e44f89d`](https://github.com/janreges/siteone-crawler/commit/e44f89d6c3114ccf02c70f38d5ffa5a0f081c1b2)
- output: renamed print*() methods to more meaningful add*() relevant also for JSON output [`1069c4a`](https://github.com/janreges/siteone-crawler/commit/1069c4a346d13878c52a316b5953ffa997ec3700)
- options: default timeout decreased from 10 to 3, --table-url-column-size renamed to --url-column-size and decreased its default value from 100 to 80, new option --hide-progress-bar, changed --truncate-url-to-column-size to --do-not-truncate-url [`e75038c`](https://github.com/janreges/siteone-crawler/commit/e75038c56afcf85ae591b1dbedf33a54fcd84754)
- readme: improved documentation describing use on Windows, macOS or arm64 Linux [`baf2d05`](https://github.com/janreges/siteone-crawler/commit/baf2d0596a3e8367d51fe6ab75793d803e984330)
- readme: added info that the crawler has actually been tested on Windows with Cygwin (Cygwin has some output limitations and it is not possible to achieve such nice behavior as on Linux) [`1f195c0`](https://github.com/janreges/siteone-crawler/commit/1f195c0c9c8565a37fcb5786070e69c6aa0b8e0e)
- windows compatibility: ensuring compatibility with running through Cygwin Swoole, which I recommend in the documentation for Windows users [`c22cc45`](https://github.com/janreges/siteone-crawler/commit/c22cc4559ed3de2ac5e4e6e2957b4d3233b4fda5)
- json output: implemented nice continuous progress reporting, intentionally on STDERR so the output on STDOUT can be used to save JSON to file + improved README.md [`c095249`](https://github.com/janreges/siteone-crawler/commit/c095249d03c96a00da75553b10dadf7e025a5b0b)
- limits: increased limit of max queue length from 1000 to 2000 (this default will be more suitable even for medium-sized websites) [`c8c3312`](https://github.com/janreges/siteone-crawler/commit/c8c33121c371cc4d0f0791a250178254d9e3a88a)
- major refactoring: splitting the code into classes, improving error handling and implementing other functions (JSON output, assets crawling) [`f6902fc`](https://github.com/janreges/siteone-crawler/commit/f6902fc025943ef96150739ae6834358097b235d)
- readme: added information on how to use the crawler with Windows, macOS or arm64 architecture + a few other details [`721f4bb`](https://github.com/janreges/siteone-crawler/commit/721f4bb73e92f65ca3aab789219f046dea665931)
- url parsing: handled situations when relative or dotted URLs are also used in HTML, e.g. href='sub/page', href='./sub/page' or href='../sub/page', href='../../sub/page' etc. + few minor optimizations [`c2bbf72`](https://github.com/janreges/siteone-crawler/commit/c2bbf72cf636340a43ebf8472c38008d0fc50f27)
- memory allocation: added optional params --max-queue-length=&lt;n&gt; (default 1000), --max-visited-urls=&lt;n&gt; (default 5000) and --max-url-length=&lt;u&gt; (default 2000) [`947a43f`](https://github.com/janreges/siteone-crawler/commit/947a43f3bb826ad852ca51390ae2778fbff320e0)
- Initial commit with first version 2023.10.1 [`7109788`](https://github.com/janreges/siteone-crawler/commit/71097884df3c1ade6fd7c02b4ac9ac8f5f161a12)
