


For product images, thumbnails are definitely the most common form of image display. I am sure you have seen them when you shop online or if you happen to run an online store. The use of thumbnails substantially reduces bandwidth and loading time, making it much easier for people to browse through different products.

There are two ways to extract the full-sized images from the thumbnails using Octoparse.

Option 1: You can set up a loop click to click through each of the thumbnails, then extract the full-sized image once it has loaded.
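For readers curious about what the loop click in Option 1 amounts to, here is a minimal Python sketch of the same click-then-read pattern. It is not how Octoparse is implemented. The function name `collect_full_size_urls`, its parameters, and the file names in the dry run are all made up for illustration.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def collect_full_size_urls(
    thumbnails: Iterable[T],
    click: Callable[[T], None],
    read_main_image_url: Callable[[], str],
) -> List[str]:
    """Click each thumbnail in turn and record the full-sized image URL it reveals."""
    urls: List[str] = []
    for thumb in thumbnails:
        click(thumb)                    # the click swaps the main image on the page
        url = read_main_image_url()     # read whatever the main image points to now
        if url and url not in urls:     # skip empty values and duplicates
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Dry run with a stand-in for a product page (hypothetical file names):
    # clicking a thumbnail replaces the main image with its full-sized version.
    page = {"main": ""}
    thumbs = ["a_thumb.jpg", "b_thumb.jpg"]
    urls = collect_full_size_urls(
        thumbs,
        click=lambda t: page.update(main=t.replace("_thumb", "_full")),
        read_main_image_url=lambda: page["main"],
    )
    print(urls)
```

Passing the click and read steps in as functions keeps the loop independent of any particular browser tool. In Octoparse, the same two steps are configured visually rather than written as code.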
You will need the following tools:

* Octoparse: a coding-free visual web scraping tool
* Tab Save: a Chrome plugin that saves images instantly from a list of URLs

Prerequisites

It would be best if you are familiar with how Octoparse works in general. Octoparse Scraping 101 may also help if you are new to the tool.

There’s just one more thing to tweak before running the crawler. While debugging, I noticed that the HTML source code is refreshed dynamically as one scrolls down the webpage. In other words, if the webpage is not scrolled down, we will not be able to get the corresponding image URLs from the source code. Luckily for us, Octoparse handles auto scroll easily. We will need to add auto scroll both when the website loads for the first time and when it paginates.

Click on “Go to Webpage” from the workflow. On the right side of the workflow, find “Advanced options” and check “Scroll down to the bottom of the page when finished loading.” Then decide how many times to scroll and at what pace. Here I set scroll times = 40, interval = 1 second, and scroll way = scroll down for one screen. This means Octoparse will scroll down one screen 40 times, with 1 second between scrolls. I did not come up with this setting randomly; it took a bit of fine-tuning to make sure it works. I also noticed that it was essential to use “Scroll down for one screen” as opposed to “Scroll down to the bottom of the page,” mainly because the image URLs we need are only added to the source code gradually.
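The scroll setting described above (40 scrolls of one screen each, 1 second apart) can be sketched outside Octoparse as a simple loop. This is only an illustration of the idea, not Octoparse's internals. The function name `scroll_one_screen_at_a_time` and its parameters are assumptions made for this example.

```python
import time
from typing import Callable, List


def scroll_one_screen_at_a_time(
    scroll_once: Callable[[], None],
    times: int = 40,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call scroll_once `times` times, pausing `interval` seconds after each call.

    Returns the number of scrolls performed.
    """
    done = 0
    for _ in range(times):
        scroll_once()     # advance by one screen, not straight to the bottom
        sleep(interval)   # give the page time to add the next batch of image URLs
        done += 1
    return done


if __name__ == "__main__":
    # Dry run with a stand-in for the browser: record each scroll instead of
    # driving a real page.
    log: List[int] = []
    scroll_one_screen_at_a_time(lambda: log.append(len(log) + 1), times=3, interval=0.0)
    print(log)
```

With the default arguments the loop takes roughly 40 seconds, matching the setting above. If you wanted to drive a real browser, `scroll_once` could for instance wrap a Selenium call such as `driver.execute_script("window.scrollBy(0, window.innerHeight);")`. That is one possible wiring, and it is not needed for the Octoparse workflow.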
Some images can be fetched from the webpage directly; others are triggered only by clicking the thumbnails. In this tutorial, I will show you how to deal with each of these scenarios via a few examples.


Saving an image from a webpage is straightforward: simply right-click and select “save image as.” But what if you have hundreds or even thousands of images that need to be saved? Will the same trick work? In this article, I want to show you how to quickly build an image crawler without coding. Even if you have absolutely no tech background, you should be able to nail this within 30 minutes. No matter what you need the pictures for, whether that be reblogging, reselling, or training machine learning models, the same trick can be extended to virtually any website.
