Retrieving a Website using cURL

As mentioned in the collection of anti-CAPTCHA / Cloudflare solutions, here is a way to retrieve a URL using byparr:

curl \
  -X 'POST' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{ "cmd": "request.get", "url": "WEBSITE_URL", "max_timeout": 60}' \
  http://BYPARR_HOST:BYPARR_PORT/v1 | \
    jq -r 'select(.solution.status == 200).solution.response'

where:

WEBSITE_URL is the URL of the website protected by the CAPTCHA,
BYPARR_HOST is the hostname of the byparr service,
BYPARR_PORT is the port of the byparr service.

Here jq is used to print the response only if the HTTP status returned by the website is 200 OK, which indicates a successful request. For any other status the filter selects nothing, so the pipeline prints no output.
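The behaviour of the jq filter can be tried offline. The following is a minimal sketch: the helper name extract_response and both sample payloads are made up for illustration, and only the .solution.status and .solution.response fields are taken from the command above. The -e flag is an addition that makes jq exit with a non-zero status when nothing is selected, so a script can tell a blocked request from an empty page.

```shell
# Hypothetical helper: reads a byparr-style JSON reply on stdin and prints
# the page body only when the website answered with HTTP 200.
# -e makes jq exit non-zero when the filter produces no output.
extract_response() {
  jq -e -r 'select(.solution.status == 200).solution.response'
}

# Invented sample replies (not real byparr output).
ok='{"solution": {"status": 200, "response": "<html>ok</html>"}}'
blocked='{"solution": {"status": 403, "response": "<html>denied</html>"}}'

# Prints the body of the successful reply.
printf '%s' "$ok" | extract_response

# Prints nothing on stdout; the fallback message goes to stderr.
printf '%s' "$blocked" | extract_response || echo "no usable response" >&2
```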

Note that, contrary to other solutions, the request is sent to the byparr server, and the site that should actually be retrieved is passed as one of the parameters in the JSON body sent to byparr.
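Because the target URL travels inside the JSON body, it has to be escaped correctly. A possible way to wrap the command in reusable shell functions is sketched below. The function names build_payload and fetch_via_byparr are hypothetical, and the default values for BYPARR_HOST and BYPARR_PORT are placeholders that need to match your own setup.

```shell
# Placeholder defaults (assumptions); override them in the environment.
BYPARR_HOST="${BYPARR_HOST:-localhost}"
BYPARR_PORT="${BYPARR_PORT:-8191}"

# Build the JSON body with jq so that quotes and other special characters
# in the URL are escaped properly.
build_payload() {
  jq -c -n --arg url "$1" '{cmd: "request.get", url: $url, max_timeout: 60}'
}

# Send the body to byparr (curl reads it from stdin via -d @-) and print
# the page only if the website answered with HTTP 200.
fetch_via_byparr() {
  build_payload "$1" |
    curl -sS -X POST \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d @- \
      "http://${BYPARR_HOST}:${BYPARR_PORT}/v1" |
    jq -r 'select(.solution.status == 200).solution.response'
}
```

With these functions loaded, a call such as fetch_via_byparr 'WEBSITE_URL' > page.html would store the retrieved page in a file.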

Better Version for AARCH64/ARM64

Recent byparr versions appear to have a problem that results in timeouts on IoT boards such as the Raspberry Pi or the Orange Pi, specifically when the board is based on an AARCH64/ARM64 CPU. The issue can be worked around by temporarily downgrading to an earlier version that has been reported to work well.

The following command should pull that earlier version:

docker pull ghcr.io/thephaseless/byparr:e745e118b9ce6217a76b87143e7743ba48a49e92-arm64

and it should be referenced instead of the latest image when running byparr, for example:

docker run \
  --rm \
  --name byparr \
  -p 8191:8191 \
  ghcr.io/thephaseless/byparr:e745e118b9ce6217a76b87143e7743ba48a49e92-arm64
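If the same script is used on different machines, the image could be chosen based on the CPU architecture. The following is only a sketch: the helper name byparr_image_for is hypothetical, and the tag "latest" for the regular image is an assumption derived from the wording above.

```shell
# Pinned image from the text above, used on AARCH64/ARM64 boards.
PINNED_ARM64='ghcr.io/thephaseless/byparr:e745e118b9ce6217a76b87143e7743ba48a49e92-arm64'
# Assumed tag name for the regular image.
DEFAULT_IMAGE='ghcr.io/thephaseless/byparr:latest'

# Hypothetical helper: map a machine name as reported by "uname -m"
# to the image that should be used.
byparr_image_for() {
  case "$1" in
    aarch64|arm64) printf '%s\n' "$PINNED_ARM64" ;;
    *)             printf '%s\n' "$DEFAULT_IMAGE" ;;
  esac
}

# Pick the image for the current machine.
image="$(byparr_image_for "$(uname -m)")"
echo "Selected image: $image"
```

The value of $image can then be passed to docker pull and docker run in place of the hard-coded image name.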