Octoparse

Before integrating a proxy into Octoparse, make sure the proxy itself is set up and working correctly.
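One way to confirm that the proxy works before opening Octoparse is to send a request through it and look at the IP address the remote side sees. The sketch below is only an illustration under several assumptions: your machine's IP is already whitelisted (so no username or password is sent), `PROXY_HOST` and `PROXY_PORT` are hypothetical placeholders for your own IP:PORT, and `api.ipify.org` is used as an example IP echo service rather than anything required by Octoparse.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the IP:PORT of your own proxy.
PROXY_HOST = "203.0.113.10"
PROXY_PORT = 8080

# Assumed IP echo endpoint; any service that returns the caller's IP would do.
IP_ECHO_URL = "https://api.ipify.org?format=json"


def proxy_url(host: str, port: int) -> str:
    """Build a proxy URL without credentials (whitelisted IPs need none)."""
    return f"http://{host}:{port}"


def fetch_exit_ip(host: str, port: int, timeout: float = 10.0) -> str:
    """Return the public IP that the echo service sees through the proxy."""
    url = proxy_url(host, port)
    # Route both HTTP and HTTPS requests through the same proxy endpoint.
    handler = urllib.request.ProxyHandler({"http": url, "https": url})
    opener = urllib.request.build_opener(handler)
    with opener.open(IP_ECHO_URL, timeout=timeout) as response:
        return json.load(response)["ip"]


# Example call (performs a real network request, so it is left commented out):
# print("Exit IP via proxy:", fetch_exit_ip(PROXY_HOST, PROXY_PORT))
```

If the call returns the proxy's IP rather than your own, the proxy is likely ready to use. A connection error or a "407 Proxy Authentication Required" response usually suggests that the whitelist entry is missing or has not taken effect yet.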

1. Open the Octoparse application.

2. In the upper-left menu, hover over the New button and click Advanced Mode. For testing purposes, we will create a custom task.

3. In the Website field, indicate the website from which you want to extract data. Then click the Save button.

4. You should now find yourself in the Tasks tab. To configure our proxy, select the Settings button.

5. In the pop-up menu, scroll down to Anti-blocking settings and check the Use IP proxies option. The Settings button next to it should now be clickable; click it.

6. In the Proxy Settings pop-up window, define the proxy to use. Since Octoparse only accepts proxies in a fixed IP:PORT format, you will need to use our IP whitelist feature so that your connection is authorized without the usual username and password.

7. Once the IP:PORT entry is in place, select a rotation interval based on your session type. If you are using a rotating session, set the interval to 1. If you are using a sticky session, set it to 600. Finally, click the OK button.

8. To verify that everything is working correctly, look for a check mark next to the Settings option under Anti-blocking Settings. Once confirmed, click Save to continue.

9. To extract data from our sample page, click on the IP address you can see at the top of the Octoparse application and select Extract text of the selected element.

10. When finished, click Save and then Run.

11. Depending on how you want to run the task, select one of the available extraction options. For testing purposes, you can run the task on your local device.

12. If done correctly, you should see our proxy IP in the extracted data table after the task is completed.
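The proxy entry and rotation interval from steps 6 and 7 can be sanity-checked before they are typed into Octoparse. The following is a minimal sketch, not part of Octoparse itself: the function names `parse_proxy_entry` and `rotation_interval` are hypothetical, the check assumes the proxy is given as a literal IPv4 address rather than a hostname, and the interval values are simply the ones stated in step 7.

```python
import ipaddress

# Interval values as given in step 7 of this guide.
ROTATION_INTERVALS = {"rotating": 1, "sticky": 600}


def parse_proxy_entry(entry: str) -> tuple[str, int]:
    """Split an 'IP:PORT' string and validate both parts."""
    host, sep, port_text = entry.strip().rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected IP:PORT, got {entry!r}")
    ipaddress.ip_address(host)  # raises ValueError for a malformed IP
    port = int(port_text)       # raises ValueError for a non-numeric port
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return host, port


def rotation_interval(session_type: str) -> int:
    """Map a session type to the interval value suggested in step 7."""
    try:
        return ROTATION_INTERVALS[session_type.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown session type: {session_type!r}") from None


# Example with a placeholder address from the documentation range.
host, port = parse_proxy_entry("203.0.113.10:8080")
print(f"{host}:{port}", "interval =", rotation_interval("sticky"))
```

Rejecting malformed entries early makes it easier to tell a typing mistake apart from a proxy that is not whitelisted when the extracted data table does not show the expected IP.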

With these steps completed, the proxy is integrated with Octoparse, and your extraction tasks will route their traffic through it.
